Abstract. Using the method of transportation-information inequalities introduced in
[28], we establish Bernstein-type concentration inequalities for empirical means $\frac{1}{t}\int_0^t g(X_s)\,ds$,
where $g$ is an unbounded observable of the symmetric Markov process $(X_t)$. Three approaches
are proposed: the functional inequalities approach; the Lyapunov function method;
and an approach through the Lipschitzian norm of the solution to the Poisson equation.
Several applications and examples are studied.

1. Introduction

1.1. Bernstein's concentration inequality for sequences of i.i.d.r.v. Let us begin 
with the classical Bernstein's concentration inequality in the i.i.d. case. Consider a 
sequence of real valued independent and identically distributed (i.i.d.) random variables 
(r.v.) $(\xi_k)_{k\ge 1}$, copies of some r.v. $\xi$, all defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, such
that $\mathbb{E}\xi = 0$ and $\mathbb{E}\xi^2 = \sigma^2 > 0$.



Theorem 1.1. If there is some constant $M > 0$ such that

The last inequality (1.4) is the original version of Bernstein's inequality. The proof of
(1.2) is very easy: just apply Chebyshev's inequality to obtain, for all $r, \lambda > 0$,

$$\mathbb{P}\left(\frac{1}{n}\sum_{k=1}^{n}\xi_k > r\right) \le e^{-n\lambda r}\,\mathbb{E}\exp\left(\lambda\sum_{k=1}^{n}\xi_k\right) = e^{-n[\lambda r - \Lambda(\lambda)]}$$

and then optimize over $\lambda \in (0, 1/M)$. We refer to E. Rio [44] or P. Massart [38] for
known sufficient conditions for the verification of (1.1). For instance, (1.1) is verified with
$M = \|\xi^+\|_\infty/3$ if $\xi$ is upper bounded, or for some not very explicit constant $M > 0$ if
$\Lambda(\lambda) < +\infty$ for some $\lambda > 0$. Bernstein's concentration inequality is one of the most
powerful concentration inequalities in probability; it is sharp both on the central limit
theorem scale and on the moderate deviation scale. Inequalities of this type have had many
applications, and are now used in particular in (non-asymptotic) model selection problems,
see Massart [38] or Baraud [7].
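
As a numerical illustration of the i.i.d. case, the following minimal sketch compares an exact binomial tail with the classical Bernstein bound $\exp\{-n r^2/(2(\sigma^2 + Mr))\}$, using the choice $M = \|\xi^+\|_\infty/3$ mentioned above. The centered Bernoulli variable $\xi = B - p$ and the function names `exact_tail` and `bernstein_bound` are assumptions made for this example only; they do not come from the paper.

```python
from math import comb, exp


def exact_tail(n: int, p: float, r: float) -> float:
    """Exact P((1/n) * sum_k (B_k - p) > r) for i.i.d. Bernoulli(p) variables B_k."""
    threshold = n * (p + r)
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(n + 1)
        if k > threshold
    )


def bernstein_bound(n: int, p: float, r: float) -> float:
    """Classical Bernstein bound exp(-n r^2 / (2 (sigma^2 + M r)))."""
    sigma2 = p * (1.0 - p)   # variance of xi = B - p
    M = (1.0 - p) / 3.0      # M = ||xi^+||_inf / 3, since xi <= 1 - p
    return exp(-n * r * r / (2.0 * (sigma2 + M * r)))


if __name__ == "__main__":
    for r in (0.05, 0.1, 0.2):
        print(r, exact_tail(200, 0.3, r), bernstein_bound(200, 0.3, r))
```

The exact tail should stay below the bound for every choice of `n`, `p` and `r > 0`; how tight the bound is depends on the scale of `r`.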

There are already many works on the generalization of Bernstein's inequality to the
dependent case: Markov processes or weakly dependent processes. The strategy, however, remains
the same: control the Laplace transform of partial sums. In the Markovian context,
Lezaud [M] used Kato's perturbation theory to get results in the presence of a spectral gap,
whereas Cattiaux-Guillin [15] (building on Wu [51]) used functional inequalities for the
Laplace control or for the control of the mixing coefficients. More recently, Adamczak [1],
Bertail-Clémençon [SJ, Merlevède-Peligrad-Rio [3H] used a block strategy and then results
from the independent case. Note however that, except in the symmetric Markov process case
studied by Lezaud [31], the known results do not reach the tight form (1.2) or (1.4).
Our major objective is to give practical conditions ensuring this sharp form (1.2) in the
context of integral functionals of symmetric Markov processes.

There are two modern approaches to concentration inequalities. The first one, initiated
by Ledoux, relies on functional inequalities, such as the Poincaré or logarithmic Sobolev
inequality (see for example [2] or [33]), and has attracted a lot of attention in the past
decade: Wu [51] or Cattiaux-Guillin [15] used them in the continuous time context to
get precise control of the Laplace transform of the partial sums; see also Massart [38]
for the entropy method for various types of dependence in the discrete time case. Another
approach was to get a functional inequality for the whole law of the process and to use a
Herbst-like argument; note however that at this level of generality, the precise form of Bernstein's
inequality has not been achieved yet.

The second approach is centered on the use of transportation inequalities (see the precise
definition in Section 2 below): bounding the Wasserstein distance by some type of
information (Kullback or Fisher). Although originally investigated by Marton [361 EZ] or Talagrand [16]
for concentration, its systematic study is more recent, starting from the pioneering work of
Bobkov-Götze [10], followed by an abundant literature, see [121 El EB3 III (H EBJ with
Kullback information, and (2BJ ESI [3D] for Fisher information. Although the use of Kullback
information at the process level may lead to deviation inequalities for integral functionals
of Markov processes (see [18] for example), the precise form of Bernstein's inequality is
not reachable in that way. We will therefore use here transportation inequalities with respect to the
Fisher information, which are more natural for Markov processes: the Fisher
information is exactly the large deviations rate function in the Donsker-Varadhan theorem for symmetric
Markov processes (see [201 12H [221 M, E2]).

But before going further into the details, let us present the framework of symmetric
Markov processes.

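1.2. Symmetric Markov processes. Let $\mathcal{X}$ be a Polish space with Borel field $\mathcal{B}$.
Let $(X_t)_{t\ge 0}$ be an $\mathcal{X}$-valued càdlàg Markov process with transition probability semigroup
$(P_t)$ which is symmetric and strongly continuous on $L^2(\mu) := L^2(\mathcal{X},\mathcal{B},\mu)$, defined on
$(\Omega, \mathcal{F}, (\mathbb{P}_x)_{x\in\mathcal{X}})$ ($\mathbb{P}_x(X_0 = x) = 1$, $\forall x \in \mathcal{X}$), where $\mu$ is a probability measure on
$(\mathcal{X},\mathcal{B})$, written as $\mu \in \mathcal{M}_1(\mathcal{X})$. For a given initial distribution $\beta \in \mathcal{M}_1(\mathcal{X})$, write
$\mathbb{P}_\beta := \int_{\mathcal{X}} \beta(dx)\,\mathbb{P}_x(\cdot)$. Let $\mathcal{L}$ be the generator of $(P_t)$, whose domain in $L^p(\mu) = L^p(\mathcal{X},\mathcal{B},\mu)$ ($p \in [1,+\infty]$)
is denoted by $\mathbb{D}_p(\mathcal{L})$. It is self-adjoint and nonpositive definite on $L^2(\mu)$. Let

$$-\mathcal{L} = \int_0^{+\infty} \lambda\, dE_\lambda$$

be the spectral decomposition of $-\mathcal{L}$ on $L^2(\mu)$. The Dirichlet form $\mathcal{E}(f,g)$ is defined by

$$\mathbb{D}(\mathcal{E}) = \mathbb{D}_2(\sqrt{-\mathcal{L}}) = \left\{ h \in L^2(\mu);\ \int_0^{+\infty} \lambda\, d\langle E_\lambda h, h\rangle_\mu < +\infty \right\},$$

$$\mathcal{E}(f,g) = \langle \sqrt{-\mathcal{L}} f, \sqrt{-\mathcal{L}} g\rangle_\mu = \int_0^{+\infty} \lambda\, d\langle E_\lambda f, g\rangle_\mu, \quad f, g \in \mathbb{D}(\mathcal{E}),$$

where $\langle f,g\rangle_\mu = \int_{\mathcal{X}} fg\,d\mu$ is the standard inner product on $L^2(\mu)$.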
We will study here deviation inequalities for

$$\frac{1}{t}\int_0^t g(X_s)\,ds$$

for some $\mu$-centered function $g$ (observable). It is quite natural to expect conditions
relying on an interplay between the type of ergodicity of our Markov process and the type
of boundedness or integrability of the function $g$.

That is why a standing assumption in this paper will be the following Poincaré
inequality: for some finite nonnegative best constant $c_P$,

$$\mathrm{Var}_\mu(f) \le c_P\,\mathcal{E}(f,f), \quad \forall f \in \mathbb{D}(\mathcal{E}). \tag{1.5}$$

Here and hereafter $\mu(f) := \int_{\mathcal{X}} f\,d\mu$ and $\mathrm{Var}_\mu(f) = \mu(f^2) - \mu(f)^2$ is the variance of $f$ under
$\mu$. Poincaré's inequality is equivalent to the exponential decay of $P_t$ to the equilibrium
invariant measure $\mu$ in $L^2(\mu)$:

$$\mathrm{Var}_\mu(P_t f) \le e^{-2t/c_P}\,\mathrm{Var}_\mu(f), \quad \forall f \in L^2(\mu).$$

It is also equivalent to say that the spectral gap satisfies

$$\lambda_1 := \sup\{\lambda \ge 0;\ E_\lambda - E_0 = 0\} = \frac{1}{c_P} > 0.$$

Let us first show why this Poincaré inequality condition is natural in our context.
Indeed, the first class of test functions $g$ that can be considered is the class of bounded
ones. Using Kato's theory of perturbation of operators combined with ingenious
and difficult combinatorial calculus, Lezaud [31] proved the following Bernstein-type
concentration inequality.

Theorem 1.2. ([IS]) Let $g$ be a bounded and measurable function (say $g \in b\mathcal{B}$) such that
$\mu(g) = 0$. Then for $\beta \ll \mu$ and all $t, r > 0$,

$$\mathbb{P}_\beta\left(\frac{1}{t}\int_0^t g(X_s)\,ds > r\right) \le \left\|\frac{d\beta}{d\mu}\right\|_2 \exp\left(-\frac{2tr^2}{\sigma^2\left(1+\sqrt{1+2Mr/\sigma^2}\right)^2}\right) \le \left\|\frac{d\beta}{d\mu}\right\|_2 \exp\left(-\frac{tr^2}{2(\sigma^2 + Mr)}\right) \tag{1.6}$$

where $M = M(g) = c_P\|g\|_\infty$ and $\sigma^2$ is the asymptotic variance (in the CLT) of the
observable $g \in L^2(\mu)$, given by

$$\sigma^2 = \sigma^2(g) := \lim_{t\to\infty} \frac{1}{t}\,\mathrm{Var}_{\mathbb{P}_\mu}\left(\int_0^t g(X_s)\,ds\right) = 2\int_0^{\infty} \langle P_t g, g\rangle_\mu\,dt. \tag{1.7}$$



For generalizations of this result see Cattiaux-Guillin [15], Guillin-Léonard-Wu-Yao [28],
etc. Notice a remarkable point: (1.6) is sharp both for the central limit theorem (CLT)
scale $r \propto 1/\sqrt{t}$ (since $\frac{1}{\sqrt{t}}\int_0^t g(X_s)\,ds$ converges in law to the centered Gaussian distribution
with variance $\sigma^2(g)$, see [31]), and for the moderate deviation scale (i.e. $1/\sqrt{t} \ll r \ll 1$)
by the moderate deviation principle due to [50].

Notice that if $\sigma^2(g) \le C\|g\|_\infty^2$ for some constant $C > 0$ and for all $g \in b\mathcal{B}$ with $\mu(g) = 0$,
then Bernstein's concentration inequality (1.6) implies the Poincaré inequality (1.5),
by [28, Theorem 3.1]. In other words, the Poincaré inequality is a minimal assumption for
Bernstein's concentration inequality to hold for all bounded observables $g$.

Remark 1.3. Let us point out that for bounded $g$, the assumption that $\sigma^2(g) \le C\|g\|_\infty^2$
is a weak one, since by definition (1.7)

$$\sigma^2(g) \le 2\|g\|_\infty \int_0^{\infty} \mathrm{Var}_\mu(P_t g)^{1/2}\,dt.$$

Assume now that a weak Poincaré inequality holds (see [5] for example), or a Lyapunov
condition, i.e. $\mathcal{L}V \le -\phi(V) + b1_C$ for some sublinear $\phi$ (see [23] for details), ensuring
that $\mathrm{Var}_\mu(P_t g) \le \psi(t)\|g\|_\infty^2$ with $\int_0^\infty \psi(s)^{1/2}\,ds < \infty$; then the Poincaré inequality holds
whenever the Bernstein-type inequality does. We refer to the last section for some examples of this
Lyapunov condition.

1.3. Main question and organization. The main question we will focus on in this
paper is: what is the interplay between the ergodic properties of the symmetric Markov
process and the test function $g$? Or more precisely, how can one bound the constant $M$
(appearing in (1.6)) by means of quantities other than $\|g\|_\infty$ and $c_P$?

In fact we shall answer this question by a very simple approach: instead of a direct
control of the Laplace transform of partial sums, we use the method of transportation-information
inequalities introduced by Guillin-Léonard-Wu-Yao [28].

This paper is organized as follows. In the next section we describe the strategy and the
main idea of this work, giving along the way another proof of Theorem 1.2 with a better
estimate of $M$. The goal of the three following sections is to generalize Bernstein's inequality
to the unbounded case. We present three approaches: (1) functional inequalities such as the
log-Sobolev inequality or the $\Phi$-Sobolev inequality; (2) the Lipschitzian norm $\|(-\mathcal{L})^{-1}g\|_{Lip}$;
and (3) Meyn-Tweedie's Lyapunov function method. Finally the last section is dedicated
to the case where the Poincaré inequality no longer holds, and the class of bounded
test functions is now too large. Once again, the approach via Lyapunov functions will be
particularly efficient.

Note that, from Section 2 through Section 5, we assume implicitly that the previous Poincaré
inequality is satisfied.

Before proceeding, let us fix some more notation. For $p \in [1,+\infty]$, $\|\cdot\|_p$ is
the standard norm of $L^p(\mu) := L^p(\mathcal{X},\mathcal{B},\mu)$, and $L_0^p(\mu) := \{g \in L^p(\mu);\ \mu(g) = 0\}$. The
quantity $\sigma^2$ always denotes the asymptotic variance $\sigma^2(g)$ in the CLT, given by (1.7). The
empirical measure $\frac{1}{t}\int_0^t \delta_{X_s}\,ds$ ($\delta_x$ being the Dirac measure at the point $x$) is denoted by $L_t$, so
that $\frac{1}{t}\int_0^t g(X_s)\,ds = L_t(g)$.

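2. A transportation-information look at Bernstein's inequality

2.1. The strategy and the main idea. As in [28], our starting point is

Theorem 2.1. (Wu [51]) Let $g \in L_0^1(\mu)$. Then

$$\mathbb{P}_\beta\left(\frac{1}{t}\int_0^t g(X_s)\,ds > r\right) \le \left\|\frac{d\beta}{d\mu}\right\|_2 e^{-tJ(r-)}, \quad \forall t, r > 0 \tag{2.1}$$

where

$$J(r) := \inf\{I(\nu|\mu);\ \nu(|g|) < +\infty,\ \nu(g) = r\}, \quad J(r-) := \lim_{\varepsilon \to 0+} J(r-\varepsilon), \quad r \in \mathbb{R},$$

and

$$I(\nu|\mu) := \begin{cases} \mathcal{E}(\sqrt{f},\sqrt{f}), & \text{if } \nu = f\mu,\ \sqrt{f} \in \mathbb{D}(\mathcal{E}), \\ +\infty, & \text{otherwise} \end{cases} \tag{2.2}$$

is the Fisher-Donsker-Varadhan information of $\nu$ with respect to (w.r.t.) $\mu$.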

By the large deviations in Donsker-Varadhan [20] |2T] (in the regular case) and Wu [52]
(in full generality), $\nu \to I(\nu|\mu)$ is the rate function in the large deviations of the empirical
measures $L_t := \frac{1}{t}\int_0^t \delta_{X_s}\,ds$, and the Cramér-type inequality (2.1) is sharp for large time
$t$. The main problem now is to estimate the rate function $J(r)$ in the large deviations of
$\frac{1}{t}\int_0^t g(X_s)\,ds$: that is exactly the role that the transportation-information inequality plays.

Theorem 2.2. ([28, Theorem 2.4]) Let $g \in L_0^1(\mu)$ and let $\alpha : \mathbb{R} \to [0,+\infty]$ be a nondecreasing
left-continuous convex function with $\alpha(0) = 0$. The following properties are equivalent:

(a) $\alpha(\nu(g)) \le I(\nu|\mu)$, $\forall \nu \in \mathcal{M}_1(\mathcal{X})$ such that $\nu(|g|) < +\infty$.

(b) $\nu(g) \le \alpha^{-1}(I(\nu|\mu))$, $\forall \nu \in \mathcal{M}_1(\mathcal{X})$ such that $\nu(|g|) < +\infty$, where $\alpha^{-1}(x) := \inf\{r \in \mathbb{R};\ \alpha(r) > x\}$
is the right inverse of $\alpha$.

(c) It holds that

$$\mathbb{P}_\beta\left(\frac{1}{t}\int_0^t g(X_s)\,ds > r\right) \le \left\|\frac{d\beta}{d\mu}\right\|_2 e^{-t\alpha(r)}, \quad \forall t, r > 0. \tag{2.3}$$

(d) It holds that

$$\mathbb{P}_\beta\left(\frac{1}{t}\int_0^t g(X_s)\,ds > \alpha^{-1}(x)\right) \le \left\|\frac{d\beta}{d\mu}\right\|_2 e^{-tx}, \quad \forall t, x > 0. \tag{2.4}$$

(e) For any $\lambda > 0$,

$$\Lambda(\lambda g) := \sup\left\{ \int_{\mathcal{X}} \lambda g h^2\,d\mu - \mathcal{E}(h,h) \ \Big|\ h \in \mathbb{D}(\mathcal{E}),\ \mu(h^2) = 1 \right\} \le \alpha^*(\lambda) \tag{2.5}$$

where $\alpha^*(\lambda) := \sup_{r \ge 0}\{\lambda r - \alpha(r)\}$ is the (semi-)Legendre transform of $\alpha$.

This result is not completely contained in [28, Theorem 2.4] (the condition (A2) therein is not
satisfied), but the proof there works. Indeed (a) $\Leftrightarrow$ (b) and (c) $\Leftrightarrow$ (d) are obvious. We
give the proof of the crucial implication (a) $\Rightarrow$ (c) for its simplicity. In fact, by the
transportation-information inequality in (a), we have for $r > 0$,

$$J(r) = \inf\{I(\nu|\mu);\ \nu(|g|) < +\infty,\ \nu(g) = r\} \ge \alpha(r)$$

and then $J(r-) \ge \alpha(r)$ by the left-continuity of $\alpha$. Hence the concentration inequality
(2.3) follows immediately from (2.1).

Remark 2.3. By Rayleigh's principle, $\Lambda(\lambda g)$ is the supremum of the spectrum of the
Schrödinger operator $\mathcal{L} + \lambda g$ (in the sum-form sense).

Bernstein's inequality (1.6) is just (2.3) with

$$\alpha(r) = \frac{2r^2}{\sigma^2\left(1+\sqrt{1+2Mr/\sigma^2}\right)^2}\,1_{r \ge 0}.$$

Since $\alpha^{-1}(x) = \sqrt{2\sigma^2 x} + Mx$ for $x \ge 0$, by Theorem 2.2 Bernstein's inequality (1.6) is
equivalent to

$$\nu(g) \le \sqrt{2\sigma^2 I} + MI, \quad I := I(\nu|\mu), \quad \forall \nu \in \mathcal{M}_1(\mathcal{X}) \text{ such that } \nu(|g|) < +\infty. \tag{2.6}$$

That is the strategy of this work.
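
The equivalence between (1.6) and (2.6) rests on the fact that $x \mapsto \sqrt{2\sigma^2 x} + Mx$ inverts the rate $\alpha(r) = 2r^2/\big(\sigma^2(1+\sqrt{1+2Mr/\sigma^2})^2\big)$ on $r \ge 0$. The sketch below checks this numerically, together with the comparison $\alpha(r) \ge r^2/(2(\sigma^2+Mr))$ behind the second bound in (1.6). The function names and the sample parameter values are illustrative assumptions, not part of the paper.

```python
from math import sqrt


def alpha(r: float, sigma2: float, M: float) -> float:
    """Bernstein rate: 2 r^2 / (sigma^2 (1 + sqrt(1 + 2 M r / sigma^2))^2) for r >= 0."""
    if r < 0.0:
        return 0.0
    return 2.0 * r * r / (sigma2 * (1.0 + sqrt(1.0 + 2.0 * M * r / sigma2)) ** 2)


def alpha_inv(x: float, sigma2: float, M: float) -> float:
    """Inverse of alpha on [0, +inf): sqrt(2 sigma^2 x) + M x."""
    return sqrt(2.0 * sigma2 * x) + M * x


def bernstein_rate(r: float, sigma2: float, M: float) -> float:
    """Simpler (smaller) rate r^2 / (2 (sigma^2 + M r)) used in the classical form."""
    return r * r / (2.0 * (sigma2 + M * r))


if __name__ == "__main__":
    sigma2, M = 2.0, 0.5  # arbitrary sample values
    for x in (0.01, 0.1, 1.0, 10.0):
        r = alpha_inv(x, sigma2, M)
        print(x, alpha(r, sigma2, M), bernstein_rate(r, sigma2, M))
```

For $M = 0$ both rates reduce to the Gaussian rate $r^2/(2\sigma^2)$.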

Now let us present a very simple proof of Lezaud's result, which also illustrates the
main idea of our approaches to establishing (2.6). Assume that $g \in L_0^2(\mu)$ is such that $g^+ \in L^\infty(\mu)$.

Let $\nu = f\mu$ and $h = \sqrt{f} \in \mathbb{D}(\mathcal{E})$ (the claim is trivial otherwise, since then $I = +\infty$) be such that $\nu(|g|) < +\infty$.
Our main idea resides in the following simple but key decomposition:

$$\nu(g) = \int_{\mathcal{X}} g h^2\,d\mu = \int_{\mathcal{X}} g\left[(h - \mu(h))^2 + 2\mu(h)h\right] d\mu \quad (\text{since } \mu(g) = 0)$$

$$= 2\mu(h)\langle g, h\rangle_\mu + \int_{\mathcal{X}} g\,(h - \mu(h))^2\,d\mu =: A + B. \tag{2.7}$$



Bounding A.

For the first term $A = 2\mu(h)\langle g, h\rangle_\mu$, note that $\mu(h) \le \sqrt{\mu(h^2)} = 1$. Let $(-\mathcal{L})^{-1}g =
\int_0^{+\infty} P_t g\,dt$ be the Poisson operator (the integral is absolutely convergent in $L^2(\mu)$ for all
$g \in L_0^2(\mu)$ by the Poincaré inequality). Hence

$$\sigma^2 = \sigma^2(g) = 2\int_0^{\infty} \langle P_t g, g\rangle_\mu\,dt = 2\langle (-\mathcal{L})^{-1}g, g\rangle_\mu.$$

By Cauchy-Schwarz, we have

$$|\langle g, h\rangle_\mu| \le \sqrt{\langle (-\mathcal{L})^{-1}g, g\rangle_\mu\,\mathcal{E}(h,h)} = \sqrt{\frac{\sigma^2}{2}\,I}. \tag{2.8}$$

Hence $|A| \le \sqrt{2\sigma^2 I}$; in other words, the term $A$ is always bounded by the first term on
the right-hand side of the inequality (2.6).

Remark 2.4. Even without the hypothesis of the Poincaré inequality, (2.8) is still true
for $g \in L_0^2(\mu)$ by Kipnis-Varadhan |31J as soon as $\sigma^2(g) = 2\int_0^\infty \langle g, P_t g\rangle_\mu\,dt < +\infty$. The latter
condition is the famous sufficient condition of Kipnis-Varadhan for the CLT of $\int_0^t g(X_s)\,ds$.
Bounding B.

Now for (2.6) it remains to prove that the second term $B$ satisfies

$$B = \int_{\mathcal{X}} g\,[h - \mu(h)]^2\,d\mu \le M\,\mathcal{E}(h,h) = MI. \tag{2.9}$$

This is indeed very easy in terms of $\|g\|_\infty$: letting $g^+ = \max\{g, 0\}$, we have by Poincaré,

$$B = \int_{\mathcal{X}} g\,[h - \mu(h)]^2\,d\mu \le \int_{\mathcal{X}} g^+[h - \mu(h)]^2\,d\mu \le \|g^+\|_\infty \mathrm{Var}_\mu(h) \le c_P\|g^+\|_\infty I.$$

In other words we have proven (2.6) with $M = c_P\|g^+\|_\infty$, which is a little better than
Lezaud's estimate $M = c_P\|g\|_\infty$. We summarize the discussion above as

Proposition 2.5. Let $g \in b\mathcal{B}$ with $\mu(g) = 0$. Then (2.6) holds with $M = c_P\|g^+\|_\infty$, or
equivalently Bernstein's inequality (1.6) holds with such $M$.
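
To make Proposition 2.5 concrete, here is a minimal sketch on the simplest symmetric Markov process: a two-state chain with jump rates `p` (from 0 to 1) and `q` (from 1 to 0). The chain itself is an assumption chosen for illustration and is not an example from the paper. For it, $\mu = (q, p)/(p+q)$, every centered function is an eigenfunction of $-\mathcal{L}$ with eigenvalue $p+q$, so $c_P = 1/(p+q)$ and $\sigma^2(g) = 2\mathrm{Var}_\mu(g)/(p+q)$, and $\mathcal{E}(h,h) = \mu_0\,p\,(h_0 - h_1)^2$. The code samples densities $h^2$ at random and records the largest value of $\nu(g) - \sqrt{2\sigma^2 I} - MI$, which should never be positive.

```python
from math import sqrt
import random


def worst_gap_two_state(p: float, q: float, g0: float,
                        trials: int = 2000, seed: int = 0) -> float:
    """Largest observed value of nu(g) - (sqrt(2 sigma^2 I) + M I) over random h.

    Two-state chain (an illustrative assumption): rate p for 0 -> 1, rate q for 1 -> 0.
    """
    mu0, mu1 = q / (p + q), p / (p + q)      # reversible probability measure
    g1 = -mu0 * g0 / mu1                     # choose g(1) so that mu(g) = 0
    c_P = 1.0 / (p + q)                      # Poincare constant = 1 / spectral gap
    var_g = mu0 * g0**2 + mu1 * g1**2
    sigma2 = 2.0 * var_g / (p + q)           # 2 <(-L)^{-1} g, g>_mu
    M = c_P * max(g0, g1, 0.0)               # M = c_P * ||g^+||_inf
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        h0, h1 = rng.uniform(1e-6, 1.0), rng.uniform(1e-6, 1.0)
        norm = sqrt(mu0 * h0**2 + mu1 * h1**2)
        h0, h1 = h0 / norm, h1 / norm        # normalise so that mu(h^2) = 1
        info = mu0 * p * (h0 - h1) ** 2      # I(nu|mu) = E(h, h)
        nu_g = mu0 * g0 * h0**2 + mu1 * g1 * h1**2
        worst = max(worst, nu_g - (sqrt(2.0 * sigma2 * info) + M * info))
    return worst


if __name__ == "__main__":
    print(worst_gap_two_state(1.0, 2.0, 1.0))
```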

Our remaining task consists in proving (2.9) with some constant $M = M(g)$ for various
classes of functions $g$ under different ergodicity conditions for the process. Note that
the best constant $M(g)$ for (2.9) (or (2.6)) is positively homogeneous, i.e. $M(cg) = cM(g)$
for all $c > 0$.

2.2. Approach by transportation-information inequality $T_cI$. Let us introduce our
first approach by means of the transportation-information inequality $T_cI$ of [28].

Consider a cost function $c : \mathcal{X}^2 \to [0,+\infty]$ which is always lower semi-continuous
(l.s.c.) with $c(x,x) = 0$ for all $x \in \mathcal{X}$; here $c(x,y)$ represents the cost of transporting a
unit mass from $x$ to $y$. Now given two probability measures $\nu, \mu \in \mathcal{M}_1(\mathcal{X})$, we define the
transportation cost from $\nu$ to $\mu$ by

$$T_c(\nu,\mu) := \inf_{\pi \in \mathcal{C}(\nu,\mu)} \iint_{\mathcal{X}^2} c(x,y)\,\pi(dx,dy) \tag{2.10}$$

where $\mathcal{C}(\nu,\mu)$ is the family of all couplings of $(\nu,\mu)$, i.e. all probability measures $\pi$ on $\mathcal{X}^2$
such that $\pi(A \times \mathcal{X}) = \nu(A)$, $\pi(\mathcal{X} \times B) = \mu(B)$ for all $A, B \in \mathcal{B}$.

Let $d(x,y)$ be a l.s.c. metric on $\mathcal{X}$, which does not necessarily generate the topology of
$\mathcal{X}$. For any $p \ge 1$, the quantity

$$W_{p,d}(\nu,\mu) := \left(T_{d^p}(\nu,\mu)\right)^{1/p} = \left(\inf_{\pi \in \mathcal{C}(\nu,\mu)} \iint_{\mathcal{X}^2} d^p(x,y)\,\pi(dx,dy)\right)^{1/p} \tag{2.11}$$

is the so-called $L^p$-Wasserstein distance between $\nu$ and $\mu$. $W_{p,d}$ is a metric on $\mathcal{M}_1^p(\mathcal{X}) :=
\{\nu \in \mathcal{M}_1(\mathcal{X});\ (\int_{\mathcal{X}} d^p(x_0,x)\,\nu(dx))^{1/p} < +\infty\}$ ($x_0 \in \mathcal{X}$ is some fixed point). We refer to
the recent books of Villani [4"8l Wf\ for more on this subject.

An important particular case is $d(x,y) = 1_{x \ne y}$, the trivial metric on $\mathcal{X}$. In that case

$$W_{1,d}(\nu,\mu) = \frac{1}{2}\|\nu - \mu\|_{TV} = \sup_{A \in \mathcal{B}} |\nu(A) - \mu(A)| \tag{2.12}$$

where $\|m\|_{TV} = \sup_{f \in b\mathcal{B},\,|f| \le 1} |m(f)|$ is the total variation of a signed bounded measure $m$
on $\mathcal{X}$. More generally, given a positive continuous weight function $\phi$, consider the distance
$d_\phi(x,y) = 1_{x \ne y}[\phi(x) + \phi(y)]$; then (cf. [26])

$$W_{1,d_\phi}(\nu,\mu) = \|\phi(\nu - \mu)\|_{TV}.$$
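
On a finite state space the identity (2.12) can be checked by brute force. The sketch below is an illustration under that finiteness assumption (the function names are hypothetical): it computes half the $\ell^1$ distance, the supremum of $|\nu(A) - \mu(A)|$ over all subsets $A$, and the cost $1 - \sum_i \min(\nu_i,\mu_i)$ of the coupling that keeps as much mass as possible in place, which is optimal for the trivial metric.

```python
from itertools import combinations


def half_total_variation(nu, mu):
    """(1/2) * ||nu - mu||_TV for probability vectors on a finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(nu, mu))


def sup_over_sets(nu, mu):
    """sup_A |nu(A) - mu(A)| by enumerating every subset A (small state spaces only)."""
    n = len(nu)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            best = max(best, abs(sum(nu[i] - mu[i] for i in subset)))
    return best


def trivial_metric_cost(nu, mu):
    """Transport cost for d(x, y) = 1_{x != y}: the mass that cannot stay in place."""
    return 1.0 - sum(min(a, b) for a, b in zip(nu, mu))


if __name__ == "__main__":
    nu = [0.5, 0.2, 0.2, 0.1]
    mu = [0.1, 0.4, 0.25, 0.25]
    print(half_total_variation(nu, mu), sup_over_sets(nu, mu), trivial_metric_cost(nu, mu))
```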

Theorem 2.6. Assume the following transportation-information inequality

$$\alpha(T_c(\nu,\mu)) \le I(\nu|\mu), \quad \forall \nu \in \mathcal{M}_1(\mathcal{X}) \tag{2.13}$$

where $\alpha$ is nonnegative, nondecreasing, convex and left-continuous with $\alpha(0) = 0$, such that
its right inverse $\alpha^{-1}$ is concave and $\alpha^{-1}(0) = 0$. Then for every measurable $g \in L_0^2(\mu)$
such that its sup-convolution

$$g^*(y) = \sup_{x \in \mathcal{X}} \left(g(x) - c(x,y)\right), \quad y \in \mathcal{X} \tag{2.14}$$

is in $L^1(\mu)$, (2.6) and Bernstein's inequality (1.6) hold with

$$M(g) = \mu(g^*)\,c_P + c_P\,\alpha^{-1}\left(\frac{1}{c_P}\right). \tag{2.15}$$

In particular, if the $W_1I$-transportation-information inequality below holds

$$W_{1,d}^2(\nu,\mu) \le 2c_G\,I(\nu|\mu), \quad \forall \nu \in \mathcal{M}_1(\mathcal{X}) \tag{2.16}$$

then (2.6) holds for every $d$-Lipschitzian function $g$ (with $\mu(g) = 0$) with

$$M(g) = \|g\|_{Lip(d)}\sqrt{2c_P c_G}.$$

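Proof. At first $g^*(y) \ge g(y)$, $y \in \mathcal{X}$, so $\mu(g^*) \ge \mu(g) = 0$. For (2.6) we may assume that
$\nu = h^2\mu$, with $0 \le h \in \mathbb{D}(\mathcal{E})$ and $\mathrm{Var}_\mu(h) \ne 0$ (the claim is trivial otherwise, since then $\nu = \mu$). Letting
$\tilde h = h - \mu(h)$ and $\tilde\nu := \tilde h^2\mu/\mathrm{Var}_\mu(h)$, we have by the very definition of $T_c$,

$$\int_{\mathcal{X}} g(x)\,\tilde\nu(dx) \le \int_{\mathcal{X}} g^*(y)\,\mu(dy) + T_c(\tilde\nu,\mu) \le \mu(g^*) + \alpha^{-1}(I(\tilde\nu|\mu)) \le \mu(g^*) + \alpha^{-1}\left(\frac{\mathcal{E}(h,h)}{\mathrm{Var}_\mu(h)}\right)$$

where we have used $\mathcal{E}(|\tilde h|,|\tilde h|) \le \mathcal{E}(\tilde h,\tilde h) = \mathcal{E}(h,h)$. It follows by the concavity of $\alpha^{-1}$ that

$$B = \int_{\mathcal{X}} g\,\tilde h^2\,d\mu \le \mu(g^*)\mathrm{Var}_\mu(h) + \mathrm{Var}_\mu(h)\,\alpha^{-1}\left(\frac{\mathcal{E}(h,h)}{\mathrm{Var}_\mu(h)}\right) \le \mu(g^*)\,c_P I + c_P I\,\alpha^{-1}(1/c_P),$$

the desired (2.9).

For the last particular case we may assume that $\|g\|_{Lip(d)} = 1$. In that case $g^* = g$, and
then one can apply (2.15). □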

Remark 2.7. By the preceding result, one can apply the criteria for $T_cI$ or $W_1I$ transportation-information
inequalities in [28] to obtain Bernstein's inequality.

3. Functional inequalities approach

3.1. Log-Sobolev inequality. Recall that for $0 \le f \in L^1(\mu)$, the entropy of $f$ w.r.t. $\mu$
is defined by

$$\mathrm{Ent}_\mu(f) = \mu(f \log f) - \mu(f)\log\mu(f). \tag{3.1}$$

The log-Sobolev inequality (j3j 33J) says

$$\mathrm{Ent}_\mu(h^2) \le 2c_{LS}\,\mathcal{E}(h,h), \quad \forall h \in \mathbb{D}(\mathcal{E}), \tag{3.2}$$

where $c_{LS}$ is the best constant, called the log-Sobolev constant. It is well known that $c_P \le c_{LS}$.

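Theorem 3.1. Assume the log-Sobolev inequality (3.2). Let $g \in L_0^2(\mu)$ satisfy

$$\Lambda(\lambda) := \log \int_{\mathcal{X}} e^{\lambda g}\,d\mu < +\infty \quad \text{for some } \lambda > 0.$$

Then the transportation-information inequality (2.6) holds with

$$M = \inf_{\lambda > 0} \frac{1}{\lambda}\left[c_P\Lambda(\lambda) + 2c_{LS}\right] \le c_P\,(\Lambda^*)^{-1}\left(\frac{2c_{LS}}{c_P}\right) \tag{3.3}$$

where $\Lambda^* : \mathbb{R}^+ \to [0,+\infty]$ is the Legendre transform of $\Lambda$ and $(\Lambda^*)^{-1}$ is its right inverse.
In particular Bernstein's inequality (1.6) holds with this constant $M$.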

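Proof. We may assume that $\nu = h^2\mu$ with $0 \le h \in \mathbb{D}(\mathcal{E})$. We have to bound the term
$B = \int_{\mathcal{X}} g\,[h - \mu(h)]^2\,d\mu$ in the decomposition (2.7). Writing $\tilde h = h - \mu(h)$, $I = I(\nu|\mu) =
\mathcal{E}(h,h)$, we have for any constant $\lambda > 0$ such that $\Lambda(\lambda) < +\infty$, $\int e^{\lambda g - a}\,d\mu = 1$ where
$a = \Lambda(\lambda) \ge 0$, and then

$$B = \frac{1}{\lambda}\left(\int (\lambda g - a)\tilde h^2\,d\mu + a\int \tilde h^2\,d\mu\right)$$

$$\le \frac{1}{\lambda}\left(\mathrm{Ent}_\mu(\tilde h^2) + a\,c_P I\right)$$

$$\le \frac{1}{\lambda}\left[2c_{LS} + \Lambda(\lambda)c_P\right]\cdot I$$

where the second line relies on $\mathrm{Ent}_\mu(f) = \sup_{u:\ \mu(e^u) \le 1} \int_{\mathcal{X}} f u\,d\mu$ (Donsker-Varadhan's
variational formula) and the Poincaré inequality, and the third one on the log-Sobolev
inequality. Optimizing over $\lambda > 0$ yields (2.6) with $M$ given in (3.3). □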

Surprisingly, the explicit estimate of $M = M(g)$ above is not available even in the
i.i.d. case under the exponential integrability condition.

Let us give a more explicit estimate of $M$ in the diffusion case. We assume that

($H_\Gamma$) $(\mathcal{E}, \mathbb{D}(\mathcal{E}))$ is given by the carré-du-champ $\Gamma : \mathbb{D}(\mathcal{E}) \times \mathbb{D}(\mathcal{E}) \to L^1(\mu)$ (a symmetric,
bilinear, nonnegative definite form):

$$\mathcal{E}(h,h) = \int_{\mathcal{X}} \Gamma(h,h)\,d\mu, \quad \forall h \in \mathbb{D}(\mathcal{E}). \tag{3.4}$$

Diffusion framework. We shall assume that $\Gamma$ is a differentiation (or equivalently the
sample paths of $(X_t)$ are continuous, $\mathbb{P}_\mu$-a.s., cf. Bakry [3]), that is: for all $(h_k)_{1 \le k \le n} \subset
\mathbb{D}(\mathcal{E})$, $g \in \mathbb{D}(\mathcal{E})$ and $F \in C_b^1(\mathbb{R}^n)$,

$$\Gamma(F(h_1, \ldots, h_n), g) = \sum_{i=1}^{n} \partial_i F(h_1, \ldots, h_n)\,\Gamma(h_i, g).$$

Write $\Gamma(f) := \Gamma(f,f)$ for simplicity.

Corollary 3.2. Assume ($H_\Gamma$) and that $\Gamma$ is a differentiation. If the log-Sobolev inequality
holds, then for any $g \in \mathbb{D}(\mathcal{E})$ such that $\Gamma(g)$ is bounded and $\mu(g) = 0$, the transportation-information
inequality (2.6) holds with

$$M = 2c_{LS}\sqrt{c_P\|\Gamma(g)\|_\infty}. \tag{3.5}$$

Proof. By Ledoux [33] or Bobkov-Götze [10], in the present diffusion case the log-Sobolev
inequality implies that

$$\Lambda(\lambda) = \log \int_{\mathcal{X}} e^{\lambda g}\,d\mu \le \frac{1}{2}c_{LS}\lambda^2\|\Gamma(g)\|_\infty, \quad \forall \lambda > 0.$$

Plugging this into (3.3), we get $M \le 2c_{LS}\sqrt{c_P\|\Gamma(g)\|_\infty}$. □
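
The last step of this proof is a one-variable minimization: with $\Lambda(\lambda) \le \frac12 c_{LS}\lambda^2\|\Gamma(g)\|_\infty$, the quantity $\frac{1}{\lambda}[c_P\Lambda(\lambda) + 2c_{LS}]$ is at most $\frac12 c_P c_{LS}\|\Gamma(g)\|_\infty\lambda + 2c_{LS}/\lambda$, which is minimized at $\lambda^* = 2/\sqrt{c_P\|\Gamma(g)\|_\infty}$ with value $2c_{LS}\sqrt{c_P\|\Gamma(g)\|_\infty}$. The sketch below checks this closed form against a grid search. The function names and the sample constants are assumptions for illustration; `gamma_sup` stands for $\|\Gamma(g)\|_\infty$.

```python
from math import sqrt


def objective(lam: float, c_P: float, c_LS: float, gamma_sup: float) -> float:
    """(1/lam) * [c_P * Lambda_bound(lam) + 2 c_LS] with Lambda_bound = c_LS lam^2 gamma_sup / 2."""
    lambda_bound = 0.5 * c_LS * lam * lam * gamma_sup
    return (c_P * lambda_bound + 2.0 * c_LS) / lam


def closed_form_M(c_P: float, c_LS: float, gamma_sup: float) -> float:
    """Minimum of the objective over lam > 0: 2 c_LS sqrt(c_P gamma_sup)."""
    return 2.0 * c_LS * sqrt(c_P * gamma_sup)


def grid_min_M(c_P: float, c_LS: float, gamma_sup: float, n: int = 20000) -> float:
    """Brute-force minimum over a grid covering (0.01, 2.01) times the minimizer."""
    lam_star = 2.0 / sqrt(c_P * gamma_sup)
    return min(
        objective(lam_star * (0.01 + 2.0 * k / n), c_P, c_LS, gamma_sup)
        for k in range(n + 1)
    )


if __name__ == "__main__":
    print(closed_form_M(1.5, 2.0, 0.7), grid_min_M(1.5, 2.0, 0.7))
```

With $c_P = c_{LS} = \theta$ and $\|\Gamma(g)\|_\infty = 1$ the closed form gives $2\theta^{3/2}$.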

Example 3.3. (Ornstein-Uhlenbeck processes) Let $\mu = \mathcal{N}(0,\theta)$, the Gaussian measure
with zero mean and variance $\theta > 0$ on $\mathcal{X} = \mathbb{R}$, and $\mathcal{L}f = f'' - \theta^{-1}x \cdot f'$. It is well
known that $c_P = c_{LS} = \theta$.

For every Lipschitzian function $g$ with $\mu(g) = 0$, $\sqrt{\|\Gamma(g)\|_\infty} = \|\nabla g\|_\infty = \|g\|_{Lip}$ (the
Lipschitzian coefficient w.r.t. the Euclidean metric). By Corollary 3.2, Bernstein's
inequality (1.6) holds with $M = 2c_{LS}\sqrt{c_P}\,\|g\|_{Lip} = 2\theta^{3/2}\|g\|_{Lip}$. It is worth mentioning that
for the special observable $g(x) = x$, (2.6) and then Bernstein's inequality (1.6) hold with
$M = 0$ (i.e. the corresponding Gaussian concentration inequality holds); and for general
$g$ with $\mu(g) = 0$,

v{g) < \\g\\ Lip Vm 

holds by [28, Proposition 2.9].

But by Theorem 3.1, for every $\mu$-centered function $g$ such that $\int e^{\delta g}\,d\mu < +\infty$ for some $\delta > 0$ (for
instance if $g \le C(1 + |x|^2)$), Bernstein's inequality (1.6) holds with $M = M(g)$ given in
(3.3). Though natural, this was not known before, to our knowledge. It is easy to see
that Bernstein's inequality is false for an observable $g(x)$ such that $\lim_{x\to\infty} g(x)/x^2 = +\infty$.

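Let us look at the particularly interesting observable $g(x) = g_0(x) := x^2 - \theta$, for which
we can get a sharp Bernstein inequality. Indeed, since $\mathcal{L}g_0 = -2\theta^{-1}g_0$,

$$\sigma^2(g_0) = 2\langle (-\mathcal{L})^{-1}g_0, g_0\rangle_\mu = \theta\,\mathrm{Var}_\mu(g_0) = 2\theta^3.$$

On the other hand observe that for each real number $a < \frac{1}{2}$, $U(x) := \exp\left(\frac{a x^2}{2\theta}\right) \in L^2(\mu)$,
and

$$\left(\mathcal{L} + \frac{a - a^2}{\theta^2}\,g_0\right) U = \frac{a^2}{\theta}\,U.$$

In other words $U$ is a positive eigenfunction of the Schrödinger operator $\mathcal{L} + \frac{a-a^2}{\theta^2}g_0$
associated with the eigenvalue $a^2/\theta$, which implies (by the Perron-Frobenius theorem and
Rayleigh's formula) that

$$\Lambda\left(\frac{a - a^2}{\theta^2}\,g_0\right) = \frac{a^2}{\theta}, \quad a < \frac{1}{2}.$$

Hence for all $\lambda < \lambda_0 := \frac{1}{4\theta^2}$, taking $a = a_- := \frac{1}{2}\left(1 - \sqrt{1 - 4\theta^2\lambda}\right) < 1/2$, we have

$$\Lambda(\lambda g_0) = \frac{1}{4\theta}\left(1 - \sqrt{1 - 4\theta^2\lambda}\right)^2.$$

Since $\lambda \to \Lambda(\lambda g_0)$ from $\mathbb{R}$ to $(-\infty,+\infty]$ is convex and lower semi-continuous, and its left
derivative at $\lambda_0$ is $+\infty$, we conclude that

$$\Lambda(\lambda) := \Lambda(\lambda g_0) = \begin{cases} \frac{1}{4\theta}\left(1 - \sqrt{1 - 4\theta^2\lambda}\right)^2, & \text{if } \lambda \le \lambda_0 = \frac{1}{4\theta^2}; \\ +\infty, & \text{if } \lambda > \lambda_0. \end{cases} \tag{3.6}$$

From the previous explicit expression we obtain (by the fact that the geometric mean is
not greater than the arithmetic mean)

$$\Lambda(\lambda) = \frac{2\theta^3\lambda^2}{2\left(\frac{1}{2}\left[1 + \sqrt{1 - 4\theta^2\lambda}\right]\right)^2} \le \frac{2\theta^3\lambda^2}{2(1 - 4\theta^2\lambda)}, \quad \lambda \in (0, \lambda_0),$$

from which it follows that $g_0(x) = x^2 - \theta$ satisfies the Bernstein inequality (1.6) with the sharp
constant $M = 4\theta^2$.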
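
The computations of this example can be checked numerically. The sketch below assumes the closed forms as reconstructed here, namely $U(x) = \exp(ax^2/(2\theta))$, the potential $\frac{a-a^2}{\theta^2}(x^2-\theta)$, the eigenvalue $a^2/\theta$, $\Lambda(\lambda g_0) = \frac{1}{4\theta}(1-\sqrt{1-4\theta^2\lambda})^2$, $\sigma^2 = 2\theta^3$ and $M = 4\theta^2$. It verifies the eigenfunction relation by central finite differences and the Bernstein-type bound $\Lambda(\lambda) \le \sigma^2\lambda^2/(2(1-M\lambda))$ on a grid. All function names are illustrative.

```python
from math import exp, sqrt


def Lambda_g0(lam: float, theta: float) -> float:
    """Top of the spectrum of L + lam * g0 for g0(x) = x^2 - theta (formula (3.6))."""
    lam0 = 1.0 / (4.0 * theta**2)
    if lam > lam0:
        return float("inf")
    root = sqrt(max(0.0, 1.0 - 4.0 * theta**2 * lam))
    return (1.0 - root) ** 2 / (4.0 * theta)


def bernstein_form(lam: float, theta: float) -> float:
    """sigma^2 lam^2 / (2 (1 - M lam)) with sigma^2 = 2 theta^3 and M = 4 theta^2."""
    sigma2 = 2.0 * theta**3
    M = 4.0 * theta**2
    return sigma2 * lam**2 / (2.0 * (1.0 - M * lam))


def eigen_residual(a: float, theta: float, x: float, eps: float = 1e-4) -> float:
    """Relative residual of (L + V) U = (a^2/theta) U at x, with L f = f'' - (x/theta) f'."""
    def U(y: float) -> float:
        return exp(a * y * y / (2.0 * theta))

    d1 = (U(x + eps) - U(x - eps)) / (2.0 * eps)            # U'(x), central difference
    d2 = (U(x + eps) - 2.0 * U(x) + U(x - eps)) / eps**2    # U''(x), central difference
    potential = (a - a * a) / theta**2 * (x * x - theta)
    return (d2 - x / theta * d1 + potential * U(x) - a * a / theta * U(x)) / U(x)


if __name__ == "__main__":
    theta = 1.0
    lam0 = 1.0 / (4.0 * theta**2)
    for k in (10, 50, 90):
        lam = k / 100.0 * lam0
        print(lam, Lambda_g0(lam, theta), bernstein_form(lam, theta))
    print(eigen_residual(0.3, theta, 1.2))
```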

Notice that (3.6) gives, by Theorem 2.2, the concentration inequality for the
estimator $\frac{1}{t}\int_0^t X_s^2\,ds$ of $\theta$, which is sharp not only for the CLT and moderate deviation scales,
but also for large deviations.

3.2. $\Phi$-Sobolev inequality. Let $\Phi : \mathbb{R}^+ \to [0,+\infty]$ be a Young function, i.e. a convex,
increasing and left-continuous function with $\Phi(0) = 0$ and $\lim_{x\to+\infty}\Phi(x) = +\infty$. Consider
the Orlicz space $L^\Phi(\mu)$ of those measurable functions $g$ on $\mathcal{X}$ such that the gauge norm

$$N_\Phi(g) := \inf\left\{c > 0;\ \int \Phi(|g|/c)\,d\mu \le 1\right\}$$

is finite, where the convention $\inf\emptyset := +\infty$ is used. The Orlicz norm of $g$ is defined by

$$\|g\|_\Phi := \sup\left\{\int g u\,d\mu;\ N_\Psi(u) \le 1\right\}$$

where

$$\Psi(r) := \sup_{\lambda \ge 0}(\lambda r - \Phi(\lambda)), \quad r \ge 0 \tag{3.7}$$

is the convex conjugate of $\Phi$. It is well known that ([4"3l Proposition 4, p. 61])

$$N_\Phi(g) \le \|g\|_\Phi \le 2N_\Phi(g).$$

The $\Phi$-Sobolev inequality says that

$$\|(h - \mu(h))^2\|_\Phi \le c_{P,\Phi}\,\mathcal{E}(h,h), \quad \forall h \in \mathbb{D}(\mathcal{E}) \tag{3.8}$$

and is sometimes called the Orlicz-Poincaré inequality, where $c_{P,\Phi}$ is the best constant. There is a
rich theory with a long history on this subject, see [TTJ [331 1^9].

Set $\tilde\Phi(x) := \Phi(x^2)$, $x \ge 0$, and let $\tilde\Psi$ be the Legendre transform of $\tilde\Phi$.

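Lemma 3.4. Assume the $\Phi$-Sobolev inequality (3.8). If $g \in L^{\tilde\Psi}(\mu)$ is such that $\mu(g) = 0$, then
$\int_0^t g(X_s)\,ds \in L^2(\mathbb{P}_\mu)$ and it holds that

$$\sigma^2(g) = \lim_{t\to\infty} \frac{1}{t}\,\mathrm{Var}_{\mathbb{P}_\mu}\left(\int_0^t g(X_s)\,ds\right) \le 2c_{P,\Phi}\|g\|_{\tilde\Psi}^2. \tag{3.9}$$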



12 FUQING GAO, ARNAUD GUILLIN, AND LIMING WU 

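Moreover

$$\langle g, h\rangle_\mu^2 \le \frac{1}{2}\sigma^2(g)\,\mathcal{E}(h,h), \quad \forall h \in \mathbb{D}(\mathcal{E}). \tag{3.10}$$

Proof. At first for $g \in L_0^2(\mu)$, notice that by the spectral decomposition and Cauchy-Schwarz,

$$\sqrt{\langle g, (-\mathcal{L})^{-1}g\rangle_\mu} = \sup_{h \in \mathbb{D}(\mathcal{E}),\ \mathcal{E}(h,h) \le 1} \langle g, h\rangle_\mu$$

and

$$|\langle g, h\rangle_\mu| = |\langle g, h - \mu(h)\rangle_\mu| \le \|g\|_{\tilde\Psi}\,N_{\tilde\Phi}(h - \mu(h)).$$

Furthermore by the $\Phi$-Sobolev inequality (3.8),

$$N_{\tilde\Phi}(h - \mu(h)) = \sqrt{N_\Phi((h - \mu(h))^2)} \le \sqrt{\|(h - \mu(h))^2\|_\Phi} \le \sqrt{c_{P,\Phi}\,\mathcal{E}(h,h)},$$

therefore

$$\langle g, (-\mathcal{L})^{-1}g\rangle_\mu \le c_{P,\Phi}\|g\|_{\tilde\Psi}^2, \quad g \in L_0^2(\mu). \tag{3.11}$$

Now take a sequence $(g_n)$ in $L_0^\infty(\mu)$ converging to $g$ in $L^{\tilde\Psi}(\mu)$; we have for any $t > 0$,

$$\frac{1}{t}\,\mathrm{Var}_{\mathbb{P}_\mu}\left(\int_0^t (g_n - g_m)(X_s)\,ds\right) \le \sigma^2(g_n - g_m) = 2\langle g_n - g_m, (-\mathcal{L})^{-1}(g_n - g_m)\rangle_\mu \le 2c_{P,\Phi}\|g_n - g_m\|_{\tilde\Psi}^2.$$

This implies not only "$\int_0^t g(X_s)\,ds \in L^2(\mathbb{P}_\mu)$" but also (3.9). The last claim (3.10) holds
for $g_n$ in place of $g$, and then remains true for $g$ by letting $n \to \infty$. □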

Theorem 3.5. Assume the $\Phi$-Sobolev inequality (3.8) and let $\Psi$ be the convex conjugate
of $\Phi$ given above. If $g \in L^{\tilde\Psi}(\mu)$ and $g^+ \in L^\Psi(\mu)$ with $\mu(g) = 0$, then the transportation-information
inequality (2.6) holds with $\sigma^2 = \sigma^2(g)$ given by (3.9) and

$$M = N_\Psi(g^+) \cdot c_{P,\Phi}. \tag{3.12}$$

In particular Bernstein's inequality (1.6) holds with that constant $M$.

Proof. The proof is even easier than that of Theorem 3.1. For (2.6) we may assume that
$\nu = h^2\mu$ with $0 \le h \in \mathbb{D}(\mathcal{E})$. By Lemma 3.4, $\sigma^2 = \sigma^2(g)$ given by (3.9) is finite. The term
$A$ in (2.6) is bounded by $\sqrt{2\sigma^2 I}$ by (3.10). For the term $B = \int_{\mathcal{X}} g\,[h - \mu(h)]^2\,d\mu$ we have

$$B \le N_\Psi(g^+)\,\|[h - \mu(h)]^2\|_\Phi \le c_{P,\Phi}\,N_\Psi(g^+)\,I,$$

from which the desired result follows. □

Remark 3.6. When $\Phi(x) = |x|$, $\Psi(x) = +\infty \cdot 1_{x > 1}$ and $N_\Psi(h) = \|h\|_\infty$. Then this result
generalizes Proposition 2.5.

Remark 3.7. For one-dimensional diffusions, an explicit necessary and sufficient condition
for the $\Phi$-Sobolev inequality (3.8) is available, see the book of M.F. Chen [17]. For the
$\Phi$-Sobolev inequality in high dimension, see the book of F.Y. Wang [19] for numerous
known results.

Example 3.8. As a well known fact (see Saloff-Coste [45]), for the Brownian motion $(B_t)$
on a compact connected Riemannian manifold $M$ of dimension $n$ with the invariant measure
$\mu$ given by the normalized Riemannian measure $\frac{dx}{V(M)}$ (where $V(M)$ is the volume
of $M$), the Dirichlet form $\int |\nabla f|^2\,d\mu$ satisfies the $\Phi$-Sobolev inequality (3.8) with

$$\Phi(t) = \begin{cases} +\infty \cdot 1_{(1,\infty)}(|t|), & \text{if } n = 1, \\ \exp(C|t|) - 1, & \text{if } n = 2, \\ |t|^{\frac{n}{n-2}}, & \text{if } n \ge 3. \end{cases}$$

Hence Bernstein's inequality (1.6) holds for $g \in L_0^2(\mu)$ satisfying

$$g^+ \in \begin{cases} L^1(\mu), & \text{if } n = 1, \\ L\log L(\mu), & \text{if } n = 2, \\ L^{n/2}(\mu), & \text{if } n \ge 3. \end{cases}$$

These results still hold for the diffusion generated by $\Delta - \nabla V \cdot \nabla$ with a $C^2$-smooth function $V$ on a
connected compact manifold.

Example 3.9. Consider the measure $\mu_\beta(dx) = \frac{\exp(-|x|^\beta)}{Z_\beta}\,dx$ (where $Z_\beta$ is the normalization
constant), and $\beta > 1$. The diffusion process corresponding to the Dirichlet form
$\langle -\mathcal{L}f, f\rangle_\mu = \int |\nabla f|^2\,d\mu$ satisfies the $\Phi$-Sobolev inequality (3.8) with

$$\Phi_\alpha(x) = x\log^\alpha(1 + x), \quad \alpha = 2(1 - 1/\beta),$$

according to Barthe, Cattiaux and Roberto (HI section 7]. Hence Bernstein's inequality
(1.6) holds for $g \in L_0^2(\mu)$ satisfying

$$\int \exp\left(\lambda\,(g^+)^{\beta/(2\beta - 2)}\right) d\mu < +\infty, \quad \text{for some } \lambda > 0. \tag{3.13}$$



Those two examples show that for Bernstein's inequality to hold, the integrability 
condition on the observable g in the continuous time symmetric Markov processes case 
may be much weaker than the exponential integrability condition in the i.i.d. case. 



4. Lyapunov function method 

Sometimes functional inequalities are difficult to check. In that situation the easy-to- 
check Lyapunov function method will be very helpful. 

4.1. General result. A measurable function $G$ is said to be in the $\mu$-extended domain
$\mathbb{D}_{e,\mu}(\mathcal{L})$ of the generator of the Markov process $((X_t), \mathbb{P}_\mu)$ if there is some measurable
function $g$ such that $\int_0^t |g|(X_s)\,ds < +\infty$, $\mathbb{P}_\mu$-a.s., and one $\mathbb{P}_\mu$-version of

$$M_t(G) := G(X_t) - G(X_0) + \int_0^t g(X_s)\,ds$$

is a local $\mathbb{P}_\mu$-martingale. It is obvious that $g$ is uniquely determined up to $\mu$-equivalence.
In such a case one writes $G \in \mathbb{D}_{e,\mu}(\mathcal{L})$ and $-\mathcal{L}G = g$. When the above properties hold for
$\mathbb{P}_x$ instead of $\mathbb{P}_\mu$ for every $x \in \mathcal{X}$, we say that $G$ belongs to the extended domain $\mathbb{D}_e(\mathcal{L})$.
In the latter case $-\mathcal{L}G = g$ is determined uniquely up to $\int_0^\infty e^{-t}P_t(x,\cdot)\,dt$-equivalence for
every $x \in \mathcal{X}$.

The Lyapunov condition can be stated now:

($H_L$) There exist a measurable function $U : \mathcal{X} \to [1,+\infty)$ in $\mathbb{D}_{e,\mu}(\mathcal{L})$, a positive function
$\phi$ and a constant $b > 0$ such that

$$-\frac{\mathcal{L}U}{U} \ge \phi - b, \quad \mu\text{-a.s.}$$

When the process is irreducible and the constant $b$ is replaced by $b1_C$ for some "small set"
$C$, it is well known that the existence of a positive bounded $\phi$ such that $\inf_{\mathcal{X}\setminus C}\phi > 0$
in ($H_L$) is equivalent to the Poincaré inequality (see [HE], for instance).

Lyapunov conditions are widely used to study the speed of convergence of Markov chains
[41] or Markov processes [24"l 123], large or moderate deviations and essential spectral radii
[5U [27J EH], or sharp large deviations [32]. More recently, they have been used to study
functional inequalities such as the weak Poincaré inequality [5] or the super-Poincaré inequality
[16]. See Wang [19] on weak and super Poincaré inequalities.

For a given function $f$, let $K_\phi(f) \in [0,+\infty]$ be the minimal constant $C \in [0,+\infty]$ such
that $|f| \le C\phi$.

Theorem 4.1. Assume the Lyapunov function condition ($H_L$). For $g \in L_0^2(\mu)$, if $K_\phi(g^+) <
+\infty$, then the transportation-information inequality (2.6) holds with

$$M = K_\phi(g^+)\,(b\,c_P + 1). \tag{4.1}$$

In particular Bernstein's inequality (1.6) holds with that constant $M$.

Proof. We are inspired by the elegant proof of Barthe-Bakry-Cattiaux-Guillin [4] for the
Poincaré inequality. As before let $\nu = h^2\mu$ with $0 \le h \in \mathbb{D}(\mathcal{E})$. For the term $B =
\int_{\mathcal{X}} g\,[h - \mu(h)]^2\,d\mu$ in (2.6) we have by ($H_L$),

$$B \le K_\phi(g^+)\int \phi\,[h - \mu(h)]^2\,d\mu \le K_\phi(g^+)\int \left(-\frac{\mathcal{L}U}{U} + b\right)[h - \mu(h)]^2\,d\mu.$$

By a result in large deviations [28, Lemma 5.6], we have

$$\int_{\mathcal{X}} -\frac{\mathcal{L}U}{U}\,[h - \mu(h)]^2\,d\mu \le \mathcal{E}(h,h) = I.$$

Hence applying the Poincaré inequality, we get

$$B \le K_\phi(g^+)\,(b\,c_P + 1)\,I,$$

the desired result. □



4.2. Particular case: diffusions on $\mathbb{R}^d$. Let $\mathcal{X} = \mathbb{R}^d$, and let $x \cdot y$ and $|x| = \sqrt{x \cdot x}$ be the
Euclidean inner product and norm, respectively. Consider $\mathcal{L} = \Delta - \nabla V \cdot \nabla$ on $\mathbb{R}^d$, where
$V$ is lower bounded and $C^2$-smooth such that $Z = \int_{\mathbb{R}^d} e^{-V}\,dx$ is finite. The corresponding
semigroup $P_t$ is symmetric on $L^2(\mu)$ for $\mu = \frac{1}{Z}e^{-V}\,dx$. From Theorem 4.1 we derive easily

Corollary 4.2. In the framework above, let $\gamma > 0$ be some fixed constant. If one of the
following conditions

$$\exists a < 1,\ R, c > 0 \text{ such that if } |x| > R, \quad (1 - a)|\nabla V|^2 - \Delta V \ge c\,(1 + |x|^\gamma) \tag{4.2}$$

or

$$\exists R, c > 0 \text{ such that } \forall |x| > R, \quad |x|^{\gamma/2}\,\frac{x}{|x|} \cdot \nabla V(x) \ge c\,(1 + |x|^\gamma) \tag{4.3}$$

is satisfied, then the Lyapunov function condition ($H_L$) is satisfied with $\phi(x) := c(1 + |x|^\gamma)$;
and then for any $\mu$-centered function $g$ such that $g(x) \le C(1 + |x|^\gamma)$, Bernstein's inequality
(1.6) holds for some constant $M = M(g)$ given by (4.1).



Proof. Under (4.2), one takes $U = e^{aV}$; and under (4.3) one chooses $U = e^{a|x|^{1+\gamma/2}}$ with
small enough $a > 0$ (so that $c$ may be arbitrary). One sees that condition $(H_L)$ is satisfied
in both cases. □
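As a sanity check on the first choice in this proof, the identity $-\mathcal{L}U/U = a[(1-a)|\nabla V|^2 - \Delta V]$ for $U = e^{aV}$ can be verified numerically. The sketch below is only an illustration under assumptions not made in the text: dimension $d = 1$, the potential $V(x) = x^4$, and central finite differences with a hypothetical step `h`.

```python
import math

# Assumed example potential in dimension 1 (not from the paper): V(x) = x^4.
def V(x):
    return x ** 4

def dV(x):
    return 4.0 * x ** 3

def d2V(x):
    return 12.0 * x ** 2

def minus_LU_over_U(x, a, h=1e-4):
    """Approximate -(U'' - V' U')/U for U = exp(a V) by central differences."""
    U = lambda y: math.exp(a * V(y))
    U1 = (U(x + h) - U(x - h)) / (2.0 * h)            # first derivative of U
    U2 = (U(x + h) - 2.0 * U(x) + U(x - h)) / h ** 2  # second derivative of U
    return -(U2 - dV(x) * U1) / U(x)

def closed_form(x, a):
    """a * [(1 - a) |V'|^2 - V''], the left-hand side of (4.2) times a."""
    return a * ((1.0 - a) * dV(x) ** 2 - d2V(x))

if __name__ == "__main__":
    for x in (1.0, 1.5, 2.0):
        print(x, minus_LU_over_U(x, 0.3), closed_form(x, 0.3))
```

The two columns printed should agree up to discretization error, which is consistent with (4.2) yielding a Lyapunov function proportional to $c(1+|x|^\gamma)$.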

Example 4.3. Let $V(x) = |x|^\beta$ ($\beta > 0$ is fixed) for $|x| > 1$ in the framework above.

Case 1. $\beta \in (0,1)$. In this case the Poincaré inequality does not hold (cf. [33]), and
Bernstein's inequality (1.6) does not hold for all $g \in b\mathcal{B}$ (with $\mu(g) = 0$) as explained in
the Introduction. Section 6 is devoted to such examples.

Case 2. $\beta = 1$. For this exponential-type measure $\mu$, the Poincaré inequality holds
and one can apply Lezaud's result for bounded $g$. We do not believe that Bernstein's
inequality holds for unbounded $g$.

Case 3. $\beta > 1$. Condition (4.3) is satisfied with $\gamma = 2(\beta-1)$. Hence Bernstein's
inequality (1.6) holds for $\mu$-centered $g$ such that $g \le C(1+|x|^{2(\beta-1)})$, in concordance with
condition (3.13) in Example 3.9.

4.3. Particular case: birth-death processes. Let $\mathcal{X} = \mathbb{N}$ and

$$\mathcal{L}f(k) = b_k\,(f(k+1) - f(k)) + a_k\,(f(k-1) - f(k)), \quad k \in \mathbb{N},$$

where $b_k > 0$, $k \ge 0$ are the birth rates, $a_k > 0$, $k \ge 1$ are the death rates respectively,
and $f(-1) := f(0)$.

We assume that the process is positive recurrent, i.e.,

$$\sum_{n\ge 0}\pi_n\sum_{i\ge n}(\pi_i b_i)^{-1} = \infty \quad\text{and}\quad C := \sum_{n=0}^{+\infty}\pi_n < +\infty,$$

where $\pi_n$, given by

$$\pi_0 = 1, \qquad \pi_n = \frac{b_0 b_1\cdots b_{n-1}}{a_1 a_2\cdots a_n}, \quad n \ge 1,$$

is an invariant measure of the process. Define the normalized probability $\mu$ of $\pi$ by $\mu_n = \pi_n/C$
for any $n \ge 0$, which is actually the unique reversible invariant probability of the process.

Corollary 4.4. Let $\phi_0$ be a positive weight function on $\mathbb{N}$ such that $\phi_0 \ge \delta > 0$. If there
are some constant $\kappa > 1$ and some $N \ge 1$ so that

$$a_n - \kappa b_n \ge \phi_0(n), \quad n \ge N, \tag{4.4}$$

then $(H_L)$ holds with $\phi(n) := (1-\kappa^{-1})\phi_0(n)$ (and some finite constant $b$). In particular
the results in Theorem 4.1 hold true.

Proof. Let $U(n) = \kappa^n$; we have

$$-\frac{\mathcal{L}U}{U}(n) = \frac{\kappa-1}{\kappa}\,(a_n - \kappa b_n),$$

from which it follows that $c_P < +\infty$ ([HE]) and so the desired result holds by Theorem 4.1. □

Example 4.5. (M/M/∞-queue system) Let $b_k = \lambda > 0$ ($k \ge 0$) and $a_k = k$ ($k \ge 1$).
Then $\mu$ is the Poisson distribution with parameter $\lambda$. It is an ideal model for a queue
system with a number of servers much larger than the number of clients. It is well known
that $c_P = 1$ but the log-Sobolev inequality does not hold ([53]).

For $\phi_0(n) = n + \delta$ where $\delta > 0$ is fixed, taking $U(n) = \kappa^n$ ($\kappa > 1$) as above and applying
Theorem 4.1 we get by an optimization over $\kappa > 1$ that for all $g$ such that $g \le K(n+\delta)$
($K > 0$), $B \le MI$ where

$$M = K\left[(\sqrt{\lambda}+1)^2 + \delta\right]. \tag{4.5}$$

Hence (2.6) and Bernstein's inequality (1.6) hold with such $M$. Notice that the growth
of $M$ for large $\lambda$ is linear in $\lambda$.
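The optimization over $\kappa$ leading to (4.5) can be reproduced numerically. The sketch below rests on one reading of the argument that the text does not spell out: with $U(n) = \kappa^n$ one has $-\mathcal{L}U/U = (1-\kappa^{-1})(n+\delta) - b$ with $b = (\kappa-1)\lambda + (1-\kappa^{-1})\delta$, so that Theorem 4.1 with $c_P = 1$ gives $M(\kappa) = K\frac{\kappa}{\kappa-1}(b+1)$. The function names and the grid search are illustrative choices only.

```python
import math

def M_of_kappa(kappa, lam, delta, K=1.0):
    """Constant from (4.1) for U(n) = kappa**n, assuming c_P = 1 (M/M/infinity)."""
    K_phi = K * kappa / (kappa - 1.0)                       # K_phi(g^+) for phi = (1 - 1/kappa)(n + delta)
    b = (kappa - 1.0) * lam + (1.0 - 1.0 / kappa) * delta   # assumed value of b in (H_L)
    return K_phi * (b + 1.0)

def M_closed(lam, delta, K=1.0):
    """Closed form (4.5)."""
    return K * ((math.sqrt(lam) + 1.0) ** 2 + delta)

def M_grid_min(lam, delta, K=1.0, steps=20000, kappa_max=20.0):
    """Crude grid minimization of M_of_kappa over kappa in (1, kappa_max]."""
    best = float("inf")
    for i in range(1, steps + 1):
        kappa = 1.0 + (kappa_max - 1.0) * i / steps
        best = min(best, M_of_kappa(kappa, lam, delta, K))
    return best

if __name__ == "__main__":
    for lam in (0.5, 4.0, 25.0):
        print(lam, M_grid_min(lam, 0.5), M_closed(lam, 0.5))
```

Under this reading the minimum is attained at $\kappa = 1 + 1/\sqrt{\lambda}$, and the two printed values should agree up to the grid resolution.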

An important observable is $g_0(n) = n - \lambda$ (then $L_t(g_0)$ is the difference between the
mean number of clients in the queue system during the time interval $[0,t]$ and the asymptotic
mean $\lambda$). Since $(-\mathcal{L})^{-1}g_0 = g_0$, we have $\sigma^2(g_0) = 2\langle(-\mathcal{L})^{-1}g_0, g_0\rangle_\mu = 2\,\mathrm{Var}_\mu(g_0) = 2\lambda$.
We want to get a better estimate of $M = M(g_0)$.

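For $U(n) = \kappa^n$ ($\kappa > 0$), we have

$$-\frac{\mathcal{L}U}{U} = \frac{\kappa-1}{\kappa}\,g_0 - \frac{(\kappa-1)^2}{\kappa}\,\lambda.$$

In other words $0 < U \in L^2(\mu)$ is an eigenfunction of the Schrödinger operator $\mathcal{L} + \frac{\kappa-1}{\kappa}g_0$
with eigenvalue $\frac{(\kappa-1)^2}{\kappa}\lambda$. By the Perron-Frobenius theorem and Rayleigh's principle,

$$\Lambda\left(\frac{\kappa-1}{\kappa}\,g_0\right) = \frac{(\kappa-1)^2}{\kappa}\,\lambda.$$

Thus if $s < 1$,

$$\Lambda(sg_0) = \frac{\lambda s^2}{1-s} = \frac{\sigma^2 s^2}{2(1-s)} \tag{4.6}$$

and then $\Lambda(sg_0) = +\infty$ for all $s \ge 1$ (by the convexity of $s \mapsto \Lambda(sg_0)$).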
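The eigenfunction relation behind (4.6) is purely algebraic and can be checked pointwise. In the sketch below, `generator_apply` is a hypothetical helper implementing the M/M/∞ generator of this example (birth rate $\lambda$, death rate $n$), and the substitution $s = (\kappa-1)/\kappa$, i.e. $\kappa = 1/(1-s)$, is used to pass from the eigenvalue in $\kappa$ to the form (4.6).

```python
def generator_apply(f, n, lam):
    """(L f)(n) for the M/M/infinity queue: birth rate lam, death rate n."""
    up = lam * (f(n + 1) - f(n))
    down = n * (f(n - 1) - f(n)) if n >= 1 else 0.0  # no death move from state 0
    return up + down

def eigen_residual(n, s, lam):
    """Relative residual of L U + s g0 U = Lambda U at state n, for s < 1."""
    kappa = 1.0 / (1.0 - s)            # so that s = (kappa - 1) / kappa
    U = lambda k: kappa ** k
    g0 = n - lam
    Lam = lam * s * s / (1.0 - s)      # candidate eigenvalue, as in (4.6)
    lhs = generator_apply(U, n, lam) + s * g0 * U(n)
    return (lhs - Lam * U(n)) / U(n)

if __name__ == "__main__":
    for s in (-0.5, 0.2, 0.4, 0.9):
        worst = max(abs(eigen_residual(n, s, 2.5)) for n in range(30))
        print(s, worst)
```

The residuals should vanish up to rounding error for every $s < 1$, including negative $s$ (which corresponds to $0 < \kappa < 1$).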

By Theorem 2.2 for $g = g_0$, not only does Bernstein's inequality (1.6) hold with the
optimal constant $M(g_0) = 1$, but this inequality is itself sharp: indeed (4.6) implies by
Proposition 2.1 and the large deviation lower bound in Wu [52, Theorem B.1],

$$\lim_{t\to\infty}\frac{1}{t}\log\mathbb{P}_\mu\left(\frac{1}{t}\int_0^t X_s\,ds > \lambda + r\right) = -\frac{r^2}{\lambda\left(\sqrt{1+r/\lambda}+1\right)^2}, \quad r > 0.$$

The computation above shows that the mean number of clients $\frac{1}{t}\int_0^t X_s\,ds$ does not satisfy any
Poisson-type concentration inequality, contrary to the intuition that one might have for
this standard process related to the Poisson measure.

5. A Lipschitzian approach

In this section we always assume the existence of the carré-du-champ operator $\Gamma$, i.e.
$(H_\Gamma)$ in §3. We suppose furthermore that $\Gamma = \Gamma_0 + \Gamma_1$ where $\Gamma_k$, $k = 0,1$, defined on $\mathbb{D}(\mathcal{E})$,
are both bilinear nonnegative definite forms, $\Gamma_0$ is a differentiation, and $\Gamma_1$ is given by

$$\Gamma_1(f,g)(x) = \frac{1}{2}\int (f(y)-f(x))(g(y)-g(x))\,J(x,dy), \quad f,g \in \mathbb{D}(\mathcal{E}).$$

Here $\Gamma_0$ corresponds to the continuous diffusion part of $(X_t)$, and $J(x,dy)$ is a nonnegative
jump kernel (maybe $\sigma$-infinite) on $\mathcal{X}$ such that $J(x,\{x\}) = 0$ and $\mu(dx)J(x,dy)$ is
symmetric on $\mathcal{X}^2$, describing the jump rate of the process.

5.1. General result. Recall that $\Gamma(f) = \Gamma(f,f)$.

Theorem 5.1. Assume that $d$ is a lower semi-continuous metric on $\mathcal{X}$ (which does not
necessarily generate the topology of $\mathcal{X}$), such that $\int_{\mathcal{X}} d(x,x_0)^2 d\mu(x) < +\infty$. Given $g \in L^2_0(\mu)$,
let $G \in L^2_0(\mu)\cap\mathbb{D}_2(\mathcal{L})$ be the unique solution of the Poisson equation $-\mathcal{L}G = g$.
If $\|\Gamma(G)\|_\infty < +\infty$, then the transportation-information inequality (2.6) holds with

$$M = 2\sqrt{c_P\,\|\Gamma(G)\|_\infty}. \tag{5.1}$$

In particular Bernstein's inequality (1.6) holds with that constant $M$.

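Proof. As before we may assume that $\nu = h^2\mu$ with $0 \le h \in \mathbb{D}(\mathcal{E})\cap L^\infty(\mu)$. For the term
$B = \int_{\mathcal{X}} g[h-\mu(h)]^2 d\mu$ in (2.7), setting $\tilde h = h - \mu(h)$ we write

$$B = \langle -\mathcal{L}G, \tilde h^2\rangle_\mu = \int_{\mathcal{X}}\Gamma_0(G,\tilde h^2)\,d\mu + \int_{\mathcal{X}}\Gamma_1(G,\tilde h^2)\,d\mu.$$

For the $\Gamma_0$-term, we have

$$\int_{\mathcal{X}}\Gamma_0(G,\tilde h^2)\,d\mu \le \int_{\mathcal{X}}\sqrt{\Gamma_0(G)\,\Gamma_0(\tilde h^2)}\,d\mu = 2\int_{\mathcal{X}}\sqrt{\Gamma_0(G)\,\tilde h^2\,\Gamma_0(h)}\,d\mu.$$

The $\Gamma_1$-term above requires some more work. We proceed as follows.

$$\int_{\mathcal{X}}\Gamma_1(G,\tilde h^2)\,d\mu = \frac{1}{2}\iint_{\mathcal{X}^2}(G(y)-G(x))(\tilde h(y)+\tilde h(x))(\tilde h(y)-\tilde h(x))\,\mu(dx)J(x,dy)$$

$$\le 2\sqrt{\frac{1}{2}\iint (h(y)-h(x))^2\,\mu(dx)J(x,dy)}\cdot\sqrt{\frac{1}{8}\iint (G(y)-G(x))^2[\tilde h(y)+\tilde h(x)]^2\,\mu(dx)J(x,dy)}.$$

Plugging those two estimates into the expression of $B$ above, we get by the Cauchy-Schwarz
inequality,

$$B \le 2\sqrt{\int_{\mathcal{X}}\Gamma_0(G)\tilde h^2 d\mu + \frac{1}{8}\iint (G(y)-G(x))^2(\tilde h(y)+\tilde h(x))^2\,\mu(dx)J(x,dy)}\cdot\sqrt{\int_{\mathcal{X}}(\Gamma_0(h)+\Gamma_1(h))\,d\mu}.$$

The last factor is $\sqrt{I}$. Using the symmetry in $(x,y)$ of $\mu(dx)J(x,dy)$ and $(a+b)^2 \le 2(a^2+b^2)$,
the second term inside the first square root above can be bounded by

$$\frac{1}{4}\iint (G(y)-G(x))^2[\tilde h(y)^2+\tilde h(x)^2]\,\mu(dx)J(x,dy) = \frac{1}{2}\iint (G(y)-G(x))^2\tilde h(x)^2\,\mu(dx)J(x,dy) = \int_{\mathcal{X}}\Gamma_1(G)(x)\tilde h(x)^2\,\mu(dx).$$

Hence the sum inside the first square root above is not greater than $\int_{\mathcal{X}}\Gamma(G)(x)\tilde h(x)^2\mu(dx)$.
Thus we obtain

$$B = \int g[h-\mu(h)]^2 d\mu \le 2\sqrt{\int_{\mathcal{X}}\Gamma(G)(x)\tilde h(x)^2\,\mu(dx)}\cdot\sqrt{I}. \tag{5.2}$$

Now noting that $\int_{\mathcal{X}}\Gamma(G)(x)\tilde h(x)^2\mu(dx) \le \|\Gamma(G)\|_\infty\mathrm{Var}_\mu(h) \le c_P\|\Gamma(G)\|_\infty I$, we conclude
that $B \le 2\sqrt{c_P\|\Gamma(G)\|_\infty}\,I$, the desired result. □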



Some sharp estimates of $\|\Gamma(G)\|_\infty$ for diffusions are available: see Djellout and Wu
[19] for one-dimensional diffusions, and Wu [56] for elliptic diffusions on manifolds. Here
we present examples of jump processes.

5.2. Birth-death processes continued. The following two lemmas are taken from Liu 
and Ma [35]. 

Lemma 5.2. Given a function $g$ on $\mathbb{N}$ with $\mu(g) = 0$, consider the Poisson equation

$$-\mathcal{L}G = g. \tag{5.3}$$

For any $k \ge 0$, the solution of the above equation (5.3) satisfies the following relation:

$$G(k+1) - G(k) = \frac{1}{a_{k+1}\mu_{k+1}}\sum_{i\ge k+1}\mu_i\,g(i). \tag{5.4}$$

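Lemma 5.3. Let $\rho : \mathbb{N}\to\mathbb{R}$ be an increasing function in $L^2(\mu)$. Provided that
$\|g\|_{\mathrm{Lip}(\rho)} := \sup_{k\in\mathbb{N}}\frac{|g(k+1)-g(k)|}{\rho(k+1)-\rho(k)} = 1$ with $\mu(g) = 0$, we have for any $k \ge 0$,

$$\sum_{i\ge k}\mu_i\,g(i) \le \sum_{i\ge k}\mu_i\,(\rho(i)-\mu(\rho)). \tag{5.5}$$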

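We can derive easily

Corollary 5.4. Let $\rho : \mathbb{N}\to\mathbb{R}$ be an increasing function in $L^2(\mu)$. If

$$K := \frac{1}{2}\sup_{n\ge 0}\left[1_{n\ge 1}\,\frac{1}{a_n\mu_n^2}\Big(\sum_{i\ge n}\mu_i(\rho(i)-\mu(\rho))\Big)^2 + \frac{1}{b_n\mu_n^2}\Big(\sum_{i\ge n+1}\mu_i(\rho(i)-\mu(\rho))\Big)^2\right] \tag{5.6}$$

is finite, then for every $g$ with $\mu(g) = 0$ and $\|g\|_{\mathrm{Lip}(\rho)} < +\infty$, the transportation inequality
(2.6) holds with $M = 2\sqrt{c_P K}\,\|g\|_{\mathrm{Lip}(\rho)}$.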

Proof. By Lemmas 5.2 and 5.3 the solution $G$ of $-\mathcal{L}G = g$ satisfies $\|\Gamma(G)\|_\infty \le K\|g\|^2_{\mathrm{Lip}(\rho)}$
(using $a_{n+1}\mu_{n+1} = b_n\mu_n$). It remains to apply Theorem 5.1. □

See [35] for convex concentration inequalities. Though we can give many examples to
which Corollary 5.4 applies, we want to look at the M/M/∞ queue system again.
Example 5.5. (M/M/∞ queue, continued) The constant $K$ in (5.6) above is infinite
for $\rho(n) = n$, but finite for $\rho(n) = \sum_{k=0}^{n} 1/\sqrt{k+1}$ (a quite artificial choice). What
happens for $\rho(n) = \rho_0(n) := n$? (In that case $\|g\|_{\mathrm{Lip}(\rho_0)} =: \|g\|_{\mathrm{Lip}}$ is the Lipschitzian
coefficient w.r.t. the Euclidean metric.)

A crucial feature of this model is the commutation relation $DP_t = e^{-t}P_tD$ where
$Df(n) := f(n+1) - f(n)$, a property shared by the Ornstein-Uhlenbeck process for $D = \nabla$.
From this fact one sees that

$$\|(-\mathcal{L})^{-1}g\|_{\mathrm{Lip}} \le \|g\|_{\mathrm{Lip}}.$$

Then if $\|g\|_{\mathrm{Lip}} \le 1$, $G = (-\mathcal{L})^{-1}g$ satisfies

$$\Gamma(G)(n) = \frac{1}{2}\left(\lambda[G(n+1)-G(n)]^2 + n[G(n-1)-G(n)]^2\right) \le \frac{1}{2}(\lambda+n).$$

Applying (5.2) in the proof of Theorem 5.1 we get by (4.5)

$$B \le \sqrt{2\int(\lambda+n)\tilde h^2\,\mu(dn)}\,\sqrt{I} \le \sqrt{2\left[(\sqrt{\lambda}+1)^2+\lambda\right]}\;I.$$

Thus we have proven

Corollary 5.6. For the M/M/∞ queue, if the Lipschitzian norm $\|g\|_{\mathrm{Lip}}$ of $g$ w.r.t. the
Euclidean metric is finite (and $\mu(g) = 0$), then (2.6) and Bernstein's inequality (1.6) hold
with

$$M = \sqrt{2\left[(\sqrt{\lambda}+1)^2+\lambda\right]}\;\|g\|_{\mathrm{Lip}}.$$
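To get a feeling for the size of this constant, one can evaluate it for the observable $g_0(n) = n - \lambda$ of Example 4.5, for which $\|g_0\|_{\mathrm{Lip}} = 1$ and $G = g_0$, and compare it with the optimal value $M(g_0) = 1$ obtained there. The sketch below is illustrative; the function names are hypothetical.

```python
import math

def M_lipschitz(lam, lip=1.0):
    """Constant of Corollary 5.6: sqrt(2[(sqrt(lam)+1)^2 + lam]) * ||g||_Lip."""
    return math.sqrt(2.0 * ((math.sqrt(lam) + 1.0) ** 2 + lam)) * lip

def gamma_carre(n, lam, G):
    """Gamma(G)(n) = (lam [G(n+1)-G(n)]^2 + n [G(n-1)-G(n)]^2) / 2 for M/M/infinity."""
    up = lam * (G(n + 1) - G(n)) ** 2
    down = n * (G(n - 1) - G(n)) ** 2 if n >= 1 else 0.0
    return 0.5 * (up + down)

if __name__ == "__main__":
    lam = 2.0
    g0 = lambda n: n - lam
    # Gamma(g0)(n) equals (lam + n)/2, so it is unbounded in n.
    print([gamma_carre(n, lam, g0) for n in range(5)])
    # Constant from Corollary 5.6 versus the optimal constant 1 for g0.
    print(M_lipschitz(lam), 1.0)
```

The comparison suggests that Corollary 5.6 trades sharpness for generality: it covers every centered Lipschitz observable, at the price of a constant of order $\sqrt{\lambda}$ for large $\lambda$.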



6. The subgeometric case 

6.1. General result. In this last section, we no longer suppose that a Poincaré
inequality holds, and inspired by the Lyapunov function approach, we introduce a more
classical version of the Lyapunov condition:

$(H_{LC})$ there exist a continuous function $U : \mathcal{X}\to[1,+\infty)$ in $\mathbb{D}_e(\mathcal{L})$, a measurable
positive function $\phi$, a set $C \in \mathcal{B}$ with $\mu(C) > 0$ and a constant $b > 0$ such that

$$-\frac{\mathcal{L}U}{U} \ge \phi - b1_C, \quad \mu\text{-a.s.}$$

In our mind $\phi$ goes to $0$ at infinity in this section.

We will also assume that a local Poincaré inequality holds for the set $C$ in $(H_{LC})$: there
exists some constant $\kappa_C$ such that for all $g \in \mathbb{D}(\mathcal{E})$ such that $\mu(g1_C) = 0$,

$$\mu(g^2 1_C) \le \kappa_C\,\mathcal{E}(g,g). \tag{6.1}$$

Note that for diffusions on $\mathbb{R}^d$, $C$ is often a ball $B(0,R)$ and the local Poincaré inequality
may then be easily deduced from the local Poincaré inequality for the Lebesgue measure
on balls by a perturbation argument.

Theorem 6.1. Assume the Lyapunov function condition $(H_{LC})$ and the local Poincaré
inequality (6.1) for the set $C$. For $g \in L^2_0(\mu)$ such that $\sigma^2(g)$ is finite, if $K_\phi(g^+) < +\infty$,
then the transportation-information inequality (2.6) holds with

$$M = K_\phi(g^+)\,(b\kappa_C + 1). \tag{6.2}$$

In particular Bernstein's inequality (1.6) holds with that constant $M$.

Proof. In fact we have to slightly modify the key approach described in Section 2: for a
constant $c > 0$ to be chosen later,

$$\nu(g) = \int_{\mathcal{X}} gh^2 d\mu = \int_{\mathcal{X}} g\left([h-c]^2 + 2ch\right)d\mu = 2c\langle g,h\rangle_\mu + \int_{\mathcal{X}} g[h-c]^2 d\mu =: A + B. \tag{6.3}$$

For the first term $A = 2c\langle g,h\rangle_\mu$, since $\sigma^2 = \sigma^2(g)$ is assumed to be finite, we have by
Remark 2.3 that $|A| \le c\sqrt{2\sigma^2 I}$.
Let us now consider the second term

$$B = \int_{\mathcal{X}} g[h-c]^2 d\mu \le \int_{\mathcal{X}} g^+[h-c]^2 d\mu \le K_\phi(g^+)\int_{\mathcal{X}}\left(b1_C - \frac{\mathcal{L}U}{U}\right)[h-c]^2 d\mu.$$

By a result in large deviations [25, Lemma 5.6], we have

$$\int_{\mathcal{X}} -\frac{\mathcal{L}U}{U}\,[h-c]^2 d\mu \le \mathcal{E}(h,h) = I.$$

For the other term we apply the local Poincaré inequality, valid if we consider $c = \mu(h1_C)$,
which leads to

$$B \le K_\phi(g^+)\,(b\kappa_C + 1)\,I.$$

Remark finally that $c = \mu(h1_C) \le 1$. □

Now we present an easy sufficient condition for the finiteness of $\sigma^2(g)$ (and then for the
CLT by Remark 2.4) by following Glynn and Meyn [25], which has its own interest.

Lemma 6.2. Suppose that $R_1 = \int_0^\infty e^{-t}P_t\,dt$ is $\mu$-irreducible (i.e. $\mu \ll R_1(x,\cdot)$ for every
$x \in \mathcal{X}$) and Harris positive recurrent ([41]). Assume that there are

• a (Lyapunov) continuous function $W : \mathcal{X}\to[1,+\infty)$ in the extended domain $\mathbb{D}_e(\mathcal{L})$
(see 4.1),

• a measurable function $F : \mathcal{X}\to(0,+\infty)$,

• an $R_1$-small set $C$ with $\mu(C) > 0$, i.e. $R_1(x,A) \ge \delta\nu(A)$ for all $x \in C$, $A \in \mathcal{B}$ for some
constant $\delta > 0$ and $\nu \in \mathcal{M}_1(\mathcal{X})$,

• and a positive constant $b$

such that $W$ is bounded on $C$ and

$$\mathcal{L}W \le -F + b1_C. \tag{6.4}$$

If $|g| \le cF$ for some constant $c > 0$ and $\mu(g) = 0$, then

(1) There exists some measurable function $G$ such that $|G| \le cW$ for some constant
$c > 0$, such that for any $t > 0$, $\int_0^t P_s|g|\,ds < +\infty$ and $P_tG - G = -\int_0^t P_sg\,ds$
everywhere on $\mathcal{X}$ (in such case we say that $G$ belongs to the extended domain in
the strong sense $\mathbb{D}_s(\mathcal{L})$ of $\mathcal{L}$ and write $-\mathcal{L}G = g$).






(2) If furthermore $g \in L^p(\mu)$ and $W \in L^q(\mu)$ where $p \in [2,+\infty]$ and $1/p + 1/q = 1$,
then $\sigma^2(g)$ is finite.

Its proof is postponed to the Appendix.

6.2. Particular case: diffusions on $\mathbb{R}^d$. We study here the diffusion in $\mathbb{R}^d$ with generator
$\mathcal{L} = \Delta - \nabla V\cdot\nabla$ and $\mu = e^{-V}dx/Z$, presented in Section 4. The first thing to remark
is that any compact set is a small set, and thus balls are small sets. A local Poincaré
inequality is then available. We then have

Corollary 6.3. Suppose that there exists a positive and bounded function $\phi$ such that

$$\exists a < 1,\ R, c > 0 \text{ such that if } |x| > R,\quad (1-a)|\nabla V|^2 - \Delta V \ge \phi(x). \tag{6.5}$$

Then the weak Lyapunov condition $(H_{LC})$ is satisfied with $U = e^{aV}$, with $a\phi$ in place of $\phi$ and
$C = B(0,R)$; and if $\int e^{-(1-a)V}dx < +\infty$ (i.e. $\mu(U) < +\infty$), then for any $\mu$-centered
bounded function $g$ such that $|g| \le c_1\phi U$ and $g(x) \le c_2\phi$ for some positive constants
$c_1, c_2$, the asymptotic variance $\sigma^2(g)$ is finite by Lemma 6.2 and Bernstein's inequality
holds.

Note that, in parallel to the second condition of Corollary 4.2, one may also consider a
Lyapunov function of the form $U(|x|)$, but the result is then not as explicit and we prefer
to illustrate such an approach through examples.

Example 6.4. (sub-exponential measure) Let $V(x) = |x|^\beta$ (if $|x| > 1$) for $\beta \in (0,1)$,
so that no Poincaré inequality holds. However, one may apply the previous corollary
with $U(x) = e^{a|x|^\beta}$ and $\phi(x) = (1-a-\delta)\beta^2(1+|x|)^{2(\beta-1)}$ ($a,\delta \in (0,1)$, $a+\delta < 1$). Hence
by Corollary 6.3 Bernstein's inequality holds for $\mu$-centered bounded functions $g$ such that
for large $|x|$, $g(x) \le c/(1+|x|)^{2(1-\beta)}$.

Example 6.5. (Cauchy type measure) Let $V(x) = \frac{1}{2}(d+\beta)\log(1+|x|^2)$ for $\beta > 0$.
The condition $(H_{LC})$ holds with $U = e^{aV} = (1+|x|^2)^{a(d+\beta)/2}$ and $\phi = c/(1+|x|^2)$
for some constant $c > 0$, where $a \in (0,1)$ is such that $(1-a)(d+\beta) > d$ (for $\mu(U) < +\infty$).
So Bernstein's inequality holds for $\mu$-centered bounded functions $g$ such that for large $|x|$,
$g(x) \le K/(1+|x|^2)$ for some constant $K > 0$, by Corollary 6.3.

Remark 6.6. One may be surprised that the upper bound for the test function is the
same for every Cauchy type measure. One may find the beginning of an answer in recent
results of Bobkov-Ledoux [11] (see also Cattiaux-Gozlan-Guillin-Roberto [13]). Indeed,
in their work they prove that this type of measure satisfies a weighted Poincaré type
inequality where the weight is the same for every Cauchy-type measure.

6.3. Particular case: birth-death processes. We adopt here the notation of
subsection 4.3, and assume once again that the process is positive recurrent. We suppose for
simplicity that for large enough $n$, the death rate $a_n$ is larger than the birth rate $b_n$.

Corollary 6.7. If there are $m > 0$, $N \ge 1$ and a positive sequence $(c_n)_{n\in\mathbb{N}}$ such that

(1) for all $n \ge N$, $a_n - b_n \ge c_n > 0$;

(2) $\sum_n n^m\pi_n < +\infty$;

then Bernstein's inequality is valid for every $\mu$-centered bounded function $g$ such that for
large $n$, $|g(n)| \le cn^{m-1}c_n$ and $g(n) \le Kc_n/n$ for some constants $c, K > 0$.

Proof. Let $U(n) = (1+n)^m$; then for large $n$,

$$-\frac{\mathcal{L}U(n)}{U(n)} \ge m\,(a_n - b_n)\left(\frac{1}{n} + o\Big(\frac{1}{n}\Big)\right).$$

Hence the Lyapunov condition $(H_{LC})$ holds for $\phi(n) = (m-\delta)c_n/(1+n)$ where $\delta \in (0,m)$.
The local Poincaré inequality is always valid in this context and a precise estimation of
the constant may be found in Chen [17]. Since $\mu(U)$ is finite, we can apply Lemma 6.2
to conclude that $\sigma^2(g) < +\infty$ for $|g| \le c\phi U$. It remains to apply Theorem 6.1. □

Example 6.8. Let $b_n = 1$ and $a_n = 1 + a/(n+1)$ where $a > 0$. Then $c_n := a_n - b_n = a/(n+1)$
and $\pi_n$ behaves as $n^{-a}$ for large $n$. Thus the process is positive recurrent if and
only if $a > 1$. For $a > 1$, take $m \in (0, a-1)$; we see that the conditions in Corollary 6.7 are
all satisfied. Hence Bernstein's inequality holds for $\mu$-centered $g$ such that $|g(n)| \le K/n^2$
for large $n$. This is quite similar to the Cauchy measure case.
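The polynomial decay of $\pi_n$ in this example can be observed directly from the product formula $\pi_n = \pi_{n-1}b_{n-1}/a_n$ of subsection 4.3. The sketch below is illustrative only; the choice $a = 2$ is an assumption made so that the product telescopes to $\pi_n = 6/((n+2)(n+3))$, which gives exact values to compare against.

```python
def pi_sequence(a, n_max):
    """pi_0..pi_{n_max} for b_n = 1, a_n = 1 + a/(n+1), via pi_n = pi_{n-1} b_{n-1} / a_n."""
    pis = [1.0]
    for n in range(1, n_max + 1):
        death_rate = 1.0 + a / (n + 1)
        pis.append(pis[-1] * 1.0 / death_rate)
    return pis

if __name__ == "__main__":
    a = 2.0
    pis = pi_sequence(a, 10000)
    # n^a * pi_n should approach a constant (here 6, since pi_n = 6/((n+2)(n+3))).
    for n in (10, 100, 1000, 10000):
        print(n, pis[n] * n ** a)
    # Total mass C = sum of pi_n is finite exactly when a > 1 (here it tends to 3).
    print(sum(pis))
    # Condition (2) of Corollary 6.7 with m in (0, a - 1): partial sums stay bounded.
    m = 0.5
    print(sum(n ** m * p for n, p in enumerate(pis)))
```

With $m \in (0, a-1)$ the terms $n^m\pi_n$ decay like $n^{m-a}$ with $m - a < -1$, which is what makes condition (2) of Corollary 6.7 hold.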

7. Appendix 

Proof of Lemma 6.2. Let us first prove part (2) by admitting part (1). Let $G$ be the
strong solution of $-\mathcal{L}G = g$ given in part (1). Since $W \in L^q(\mu)$, considering $G - \mu(G)$
if necessary we may and will assume that $\mu(G) = 0$. Now for any $\varepsilon > 0$, let
$R_\varepsilon = \int_0^\infty e^{-\varepsilon t}P_t\,dt = (\varepsilon - \mathcal{L})^{-1}$ be the resolvent. By the resolvent equation, $G - R_\varepsilon g = \varepsilon R_\varepsilon G$,
which tends to $\mu(G) = 0$ in $L^q(\mu)$ as $\varepsilon \to 0$ by the ergodic theorem. Hence we have

$$\lim_{\varepsilon\to 0}\langle R_\varepsilon g, g\rangle_\mu = \int Gg\,d\mu < +\infty.$$

This relation yields that $\sigma^2(g)$ in (1.7) exists and $\sigma^2(g) = 2\int Gg\,d\mu$ (in the actual
symmetric case).

We turn now to prove part (1). This is due to Glynn and Meyn [25, Theorem 3.2] when
$F$ is bounded from below by a positive constant. Let us modify slightly their proof for
the general case.

Step 1 (Reduction to the discrete time case). At first, since $e^{-bt}W(X_t)$ is a local
super-martingale, hence a super-martingale, we have $P_tW \le e^{bt}W$ for all $t \ge 0$. Moreover for
any $\lambda > 0$, by Ito's formula,

$$M_t = e^{-\lambda t}W(X_t) - W(X_0) + \int_0^t e^{-\lambda s}(\lambda W - \mathcal{L}W)(X_s)\,ds$$

is a $\mathbb{P}_x$-local martingale for every $x \in \mathcal{X}$. Hence taking a sequence of stopping times $(\tau_n)$
increasing to $+\infty$ such that $\mathbb{E}_xM_{\tau_n} = 0$, we have for every $x \in \mathcal{X}$,

$$\mathbb{E}_x\int_0^{\tau_n} e^{-\lambda s}(\lambda W + F - b1_C)(X_s)\,ds \le \mathbb{E}_x\int_0^{\tau_n} e^{-\lambda s}(\lambda W - \mathcal{L}W)(X_s)\,ds \le W(x).$$

Letting $n$ go to infinity, we obtain by monotone convergence

$$\lambda R_\lambda W + R_\lambda F \le W + bR_\lambda 1_C.$$

Consider the Markov kernel $Q = R_1$. The relation above says that

$$QW \le W - QF + bQ1_C. \tag{7.1}$$

Assume that one can prove that there is $G$ such that $|G| \le cW$ (for some constant
$c > 0$) such that

$$(1-Q)G = Qg. \tag{7.2}$$

Then $G = R_1(G+g) \in \mathbb{D}_s(\mathcal{L})$ and $R_1(-\mathcal{L})G = (1-R_1)G = R_1g$. Consequently
$-\mathcal{L}G = (1-\mathcal{L})R_1(-\mathcal{L})G = (1-\mathcal{L})R_1g = g$, the desired claim in part (1).
Therefore it remains to solve (7.2) under the condition (7.1).

Step 2 (atom case). Let us suppose at first that the small set $C$ in (7.1) is an atom
of $Q$, i.e., $Q(x,\cdot) = Q(y,\cdot)$ for all $x,y \in C$. In this case one solution to (7.2) is given by

$$G(x) = \mathbb{E}_x\sum_{k=0}^{\sigma_C} Qg(Y_k) \tag{7.3}$$

where $(Y_n)_{n\ge 0}$ is the Markov chain with transition probability kernel $Q$ defined on $(\Omega, (\mathcal{F}_n), (\mathbb{Q}_x)_{x\in\mathcal{X}})$,
equipped with the shift $\theta$ (so that $Y_n(\theta\omega) = Y_{n+1}(\omega)$), and $\sigma_C = \inf\{n \ge 0;\ Y_n \in C\}$.
To justify this fact, which is one key in [25], notice the following.

1) $G$ given by (7.3) is well defined. In fact $|Qg| \le cQF$. Using the condition (7.1) and
the fact that

$$W(Y_n) - W(Y_0) + \sum_{k=0}^{n-1}(W - QW)(Y_k)$$

is a $\mathbb{Q}_x$-martingale, we obtain the following at first for $\sigma_C\wedge n$ and then for $\sigma_C$ (by letting
$n \to \infty$):

$$\mathbb{E}_x\sum_{0\le k\le\sigma_C-1} QF(Y_k) \le b\,\mathbb{E}_x\sum_{0\le k\le\sigma_C-1} Q1_C(Y_k) + W(x) = b\,\mathbb{E}_x\sum_{1\le k\le\sigma_C} 1_C(Y_k) + W(x) \le b + W(x),$$

where the second equality for $\sigma_C\wedge n$ (instead of $\sigma_C$) follows by Doob's stopping time
theorem. Consequently

$$\mathbb{E}_x\sum_{0\le k\le\sigma_C} QF(Y_k) \le \sup_{x\in C}QF(x) + \mathbb{E}_x\sum_{k=0}^{\sigma_C-1} QF(Y_k) \le \sup_{x\in C}QF(x) + b + W(x).$$

By (7.1), $QF \le W + b$ is bounded on $C$. Therefore $G$ is well defined and $|G| \le c(b' + W)$ with $b' := \sup_{x\in C}QF(x) + b$.

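2) Let $\tau_C := \inf\{n \ge 1;\ Y_n \in C\}$. We have $\sigma_C\circ\theta = \tau_C - 1$ on $[\sigma_C = 0]$ and $\sigma_C\circ\theta = \sigma_C - 1$
on $[\sigma_C \ge 1]$. Hence for $x \in C$,

$$QG(x) = \mathbb{E}_x\sum_{k=0}^{\sigma_C\circ\theta} Qg(Y_{k+1}) = \mathbb{E}_x\sum_{k=1}^{\tau_C} Qg(Y_k),$$

which is constant on $x \in C$ and equal to $\mu(g)/\mu(C) = 0$; then $G(x) - QG(x) = G(x) = Qg(x)$
for $x \in C$. Now for $x \notin C$,

$$QG(x) = \mathbb{E}_x\sum_{k=0}^{\sigma_C\circ\theta} Qg(Y_{k+1}) = \mathbb{E}_x\sum_{k=0}^{\sigma_C-1} Qg(Y_{k+1}) = G(x) - Qg(x).$$

So $G - QG = Qg$ everywhere on $\mathcal{X}$.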

Step 3 (non-atom case). In the non-atom case one can consider the splitting chain
in [25, Proof of Theorem 2.3] to reduce the problem to the atom case. □